TRANSACTIONS 
OF THE 


AMERICAN MATHEMATICAL SOCIETY 


EDITED BY 


A. A. ALBERT EINAR HILLE OSCAR ZARISKI 


WITH THE COOPERATION OF 


RICHARD BRAUER NELSON DUNFORD W. K. FELLER 

T. H. HILDEBRANDT S. C. KLEENE M. S. KNEBELMAN 

R. E. LANGER SAUNDERS MacLANE OYSTEIN ORE

H. P. ROBERTSON J. J. STOKER D. J. STRUIK 

GABOR SZEGÖ HASSLER WHITNEY R. L. WILDER
VOLUME 53 


JANUARY TO JUNE, 1943 


PUBLISHED BY THE SOCIETY 
MENASHA, WIS., AND NEW YORK 
1943 


Composed, Printed and Bound by
George Banta Publishing Company
Menasha, Wisconsin




TABLE OF CONTENTS 
VOLUME 53, JANUARY TO JUNE, 1943 


ADAMS, C. R., and MORSE, A. P. On approximating certain integrals by

ALLENDOERFER, C. B., and WEIL, A. The Gauss-Bonnet theorem for Riemannian polyhedra

BARTELS, R. C. F. Torsion of hollow cylinders

BECKENBACH, E. F., and READE, M. Mean-values and harmonic polynomials

BERGMAN, S. Linear operators in the theory of partial differential equations

COBURN, N. Congruences in unitary space

HERZBERGER, M. Direct methods in geometrical optics

JOHNSON, R. E. On structures of infinite modules

KERSHNER, R. The continuity of functions of many variables

KLEENE, S. C. Recursive predicates and quantifiers

LANGER, R. E. A theory for ordinary differential boundary problems of the second order and of the highly irregular type

LOOMIS, L. H. The converse of the Fatou theorem for positive harmonic functions

MERSMAN, W. A. Heat conduction in an infinite composite solid with an interface resistance

MORSE, A. P., and ADAMS, C. R. On approximating certain integrals by

OLDENBURGER, R. The characteristic of a quadratic form for an arbitrary field

READE, M., and BECKENBACH, E. F. Mean-values and harmonic polynomials

REICHELDERFER, P. V. On bounded variation and absolute continuity for parametric representations of continuous surfaces

RITT, J. F. Bézout's theorem and algebraic differential equations

SALEM, R. On some singular monotonic functions which are strictly increasing

SCHATTEN, R. On the direct product of Banach spaces

SZÁSZ, O. On the partial sums of Fourier series at points of discontinuity

SZEGÖ, G. On the oscillation of differential transforms. IV. Jacobi polynomials

WEIL, A., and ALLENDOERFER, C. B. The Gauss-Bonnet theorem for Riemannian polyhedra

WONG, Y. C. Some Einstein spaces with conformally separable fundamental tensors

ZARISKI, O. Foundations of a general theory of birational correspondences



TORSION OF HOLLOW CYLINDERS 


BY 
R. C. F. BARTELS 


1. Introduction. The torsion problem for the (solid) cylinder whose cross
section is a simply connected region has received considerable attention in
recent literature. Outstanding among the published works which emphasize
methods are the investigations by: Trefftz [9] and the generalizations of his
method by Seth [6]; Muschelišvili [5] and the applications of his method by
Sokolnikoff and Sokolnikoff [7]; and more recently Stevenson [8] and the
extension of his method by Morris [4].

The torsion problem for the (hollow) cylinder whose cross section is a 
doubly connected region, on the other hand, has not enjoyed such propitious 
attention. The present analytical methods of treating this form of the prob- 
lem have been improved very little since the close of the nineteenth century 
when Macdonald [3] obtained a solution for the region bounded by eccentric 
circles making use of curvilinear orthogonal coordinates; the solution of the 
torsion problem for the region bounded by confocal ellipses was published by 
Greenhill [1] several years earlier employing the same method. It should be 
remarked that the experimental methods—for example, the membrane anal- 
ogy which was pointed out by Prandtl [12] and later improved by Griffith 
and Taylor [13], and Trayer and March [14]—are readily extended to the 
case of multiply connected regions. 

The purpose of this paper is twofold: first, to supply the need for a general 
method of obtaining a computable solution of boundary value problems of 
Dirichlet type for doubly connected regions, and second, to apply this method 
to obtain the solution of the torsion problem for certain hollow cylinders. 

The procedure of determining the solutions of the torsion problem for the 
doubly connected regions considered is in each case to map the region con- 
formally upon an annulus, and then to solve the related Dirichlet problem for 
the simpler region. To this end a formula for the solution of the general 
Dirichlet problem for the annulus is developed which, though lacking the 
elegance of the well known integral formula of Villat [10], lends itself readily 
for purposes of computation. 

2. Solution of the problem for the annular region. Let $\gamma_1$ and $\gamma_2$ denote
the circles $|\zeta| = r_1$ and $|\zeta| = r_2$, $r_1 < r_2$, respectively, in the plane of the
complex variable $\zeta$. Also, let the real functions $u_1(\sigma_1)$ and $u_2(\sigma_2)$, where
$\sigma_1 = r_1e^{i\varphi}$, $\sigma_2 = r_2e^{i\varphi}$ ($\varphi$ real), be periodic and continuous for all values
of $\varphi$ with period $2\pi$ and such that


Presented to the Society, April 8, 1938, under the title Saint-Venant's flexure problem for a
regular polygon; received by the editors July 14, 1941.




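(1)    $\frac{1}{2\pi i}\int_{\gamma_j} u_j(\sigma_j)\,\frac{d\sigma_j}{\sigma_j} = \frac{1}{2\pi}\int_0^{2\pi} u_j(\sigma_j)\,d\varphi = A \qquad (j = 1, 2),$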


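$A$ representing the common value of the integrals. Then the function(1)


(2)    $f(\zeta) = \frac{1}{\pi i}\int_{\gamma_2}\frac{U_2(\sigma_2)}{\sigma_2-\zeta}\,d\sigma_2 - \frac{1}{\pi i}\int_{\gamma_1}\frac{U_1(\sigma_1)}{\sigma_1-\zeta}\,d\sigma_1 + \text{const.},$


where the functions $U_1(\sigma_1)$ and $U_2(\sigma_2)$ are defined by the integral equations(2)


(3)    $u_j(\sigma_j) = U_j(\sigma_j) + (-1)^{j+1}\,\mathrm{R}\left\{\frac{1}{\pi i}\int_{\gamma_k}\frac{U_k(\sigma_k)}{\sigma_k-\sigma_j}\,d\sigma_k\right\} \qquad (j, k = 1, 2;\ j \neq k),$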

is single-valued and regular for $r_1 < |\zeta| < r_2$ and, except for an additive
constant, its real part takes on the values $u_1(\sigma_1)$, $u_2(\sigma_2)$ on the circles
$\gamma_1$, $\gamma_2$, respectively. The existence of the functions $U_1$, $U_2$ has been
established for functions $u_1$, $u_2$ satisfying much more general conditions than
those considered here. It is well known that the condition (1) is both necessary
and sufficient in order that the function $f(\zeta)$ determined by the values
$u_1(\sigma_1)$ and $u_2(\sigma_2)$ prescribed on $\gamma_1$ and $\gamma_2$, respectively, be single-valued.

If the functions $u_1$ and $u_2$ are replaced by two new functions satisfying (1)
and differing from $u_1$ and $u_2$ by constants, the function $f(\zeta)$ is altered only by
the addition of a constant. It can therefore be assumed, without restricting
the application of the formula (2), that the constant $A$ in (1) is zero. In this
event,

(4)    $\int_0^{2\pi} U_j(\sigma_j)\,d\varphi = 0 \qquad (j = 1, 2).$

In addition to the conditions given above, let $u_1(\sigma_1)$ and $u_2(\sigma_2)$ be
absolutely continuous functions of $\varphi$ in the interval $0 \le \varphi \le 2\pi$. Then
$U_1(\sigma_1)$, $U_2(\sigma_2)$ are also absolutely continuous functions of $\varphi$ in the same
interval. Under these conditions the infinite series


(i) —m G) —m 


(5) uo) = > + ane; Udo) = > + Ane; 
m=1 
G i, 2); 


1 U; 
7 


opt 
(j=1, 2; m=1, 2, - ‘); 
(1) Cf. G. C. Evans, The logarithmic potential, Amer. Math. Soc. Colloquium Publications,
vol. 6, 1927, pp. 112-117.
(2) The symbol R{F} is understood to mean the real part of F.


where


converge uniformly for all values of $\varphi$; the constant terms corresponding to
$m = 0$ are absent from these series as a consequence of equations (1) and (4).
Therefore, by equations (2) and (5), it follows that, for $r_1 < |\zeta| < r_2$,


(a) 


(7) = + const. 


Also, on substituting the series (5) in (3) and equating coefficients, it is seen
that
2m (2) 2m (1) 2m (a) 2m (2) 
(2) T2 Gm — 11 Om T2 Gam — 71, 


A, = 


Since $r_1 < r_2$, the coefficients in (8) can be written in the form of absolutely
convergent infinite series as follows:


An = Pas Am = om + 


where 

(9) bn = < 1. 
Let these be substituted in the infinite series in (7), which also converges
absolutely when $r_1 < |\zeta| < r_2$. Then, with the aid of the first of equations (6)
and equations (1) in which $A = 0$, and by interchanging the order of summation
with respect to $m$ and $n$, it follows, after rearranging terms, that


The regularity of the function $f(\zeta)$ defined by the infinite series in (10) is
easily established under less restrictive conditions on the functions $u_1(\sigma_1)$ and
$u_2(\sigma_2)$ with the aid of the following interesting lemma:


Lemma. Let $F(z)$ be regular and $|F(z)| < M$ for $|z| < r$, and let $F(0) = 0$. Then
the infinite series $\sum_{n=1}^{\infty} F(q^n z)$, where $0 < q < 1$, converges uniformly and
absolutely for $|z| \le r$ and, consequently, defines an analytic function which is
regular for $|z| \le r$. If $F(z)$ is regular and $|F(z)| < M$ for $|z| > r$, and if
$F(\infty) = 0$, then the preceding conclusions hold for the infinite series
$\sum_{n=1}^{\infty} F(q^n z)$, where $q > 1$, when $|z| \ge r$.

The lemma follows at once from the lemma of Schwarz(3). For, applying
the latter,

$\sum_{n=1}^{\infty} |F(q^n z)| \le \frac{M|z|}{r}\sum_{n=1}^{\infty} q^n \le \frac{Mq}{1-q}$

whenever $|z| \le r$. Hence the series converges absolutely and uniformly if
$|z| \le r$. The second part of the lemma is proved in like manner.

(3) Cf. E. C. Titchmarsh, The theory of functions, 2d edition, London, 1939, p. 168.
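
The estimate behind the lemma can be checked numerically. The sketch below is only an illustration: the function F(z) = z/(2 - z), the radius r = 1, the bound M = 1 and the ratio q = 1/2 are choices made here and do not come from the paper. It compares a partial sum of the absolute values |F(q^n z)| with the bound (M|z|/r) q/(1 - q) supplied by the lemma of Schwarz.

```python
import cmath


def F(z: complex) -> complex:
    """Illustrative choice: regular for |z| < 1, F(0) = 0 and |F(z)| < 1 there."""
    return z / (2 - z)


def partial_sum_abs(z: complex, q: float, terms: int) -> float:
    """Sum of |F(q**n * z)| for n = 1 .. terms."""
    return sum(abs(F(q**n * z)) for n in range(1, terms + 1))


def schwarz_bound(z: complex, q: float, M: float, r: float) -> float:
    """Bound (M*|z|/r) * q/(1 - q), from |F(w)| <= M*|w|/r applied termwise."""
    return M * abs(z) / r * q / (1 - q)


if __name__ == "__main__":
    z = 0.9 * cmath.exp(0.7j)
    print(partial_sum_abs(z, 0.5, 200), schwarz_bound(z, 0.5, 1.0, 1.0))
```

The partial sums stay below the bound and stabilize quickly, which is the uniform and absolute convergence asserted in the lemma for this particular F.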


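Equation (10) can be written in the form


(11)    $f(\zeta) = \frac{1}{\pi i}\int_{\gamma_2}\frac{u_2(\sigma_2)}{\sigma_2-\zeta}\,d\sigma_2 - \frac{1}{\pi i}\int_{\gamma_1}\frac{u_1(\sigma_1)}{\sigma_1-\zeta}\,d\sigma_1 + \Sigma(\zeta) + \text{const.},$

where $\Sigma(\zeta)$ represents the infinite series obtained from that in (10) by omitting
the term corresponding to $n = 0$. The application of the lemma to the
infinite series $\Sigma(\zeta)$ is at once evident with the aid of the following inequalities
which, in view of the inequality (9), are seen to hold for $n \ge 1$ and $r_1 \le |\zeta| \le r_2$: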


(12) | peat | > 


If $u_1(\sigma_1)$ and $u_2(\sigma_2)$ are merely bounded and integrable functions of $\varphi$ in the
interval $(0, 2\pi)$, the terms of the infinite series $\Sigma(\zeta)$ for $n \ge 1$ are bounded and
regular when $|\zeta| \le r_2$, and, as a consequence of equations (1) with $A = 0$,
vanish for $\zeta = 0$. The terms corresponding to $n \le -1$ are bounded and regular
for $|\zeta| \ge r_1$ and are seen to vanish for $\zeta = \infty$. Therefore by the lemma the
series converges absolutely and uniformly for $r_1 \le |\zeta| \le r_2$. Hence the
function $\Sigma(\zeta)$ is regular for $r_1 < |\zeta| < r_2$ and continuous on the circles
$\gamma_1$ and $\gamma_2$.

It will now be shown that the real part of the function $f(\zeta)$ defined in the
equation (10) takes on appropriate values on the circles $\gamma_1$ and $\gamma_2$. For, if
$\zeta_2 = r_2e^{i\varphi}$ represents an arbitrary value of $\zeta$ on $\gamma_2$, it is readily verified using
equation (1) with $A = 0$ and elementary manipulations that


wid — $2 mid — pits 


= — + aos} 


at $2 


doy, 


where the bar denotes the conjugate values. Therefore 
1 
= R{— f aes} 
wid y,01— $2 
As a consequence of this, it follows from (11) that 


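$\lim_{\zeta\to\zeta_2} \mathrm{R}\{f(\zeta)\} = \lim_{\zeta\to\zeta_2} \mathrm{R}\left\{\frac{1}{\pi i}\int_{\gamma_2}\frac{u_2(\sigma_2)}{\sigma_2-\zeta}\,d\sigma_2\right\} + \text{real const.}$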

Under the assumption that $u_2(\sigma_2)$ be bounded and integrable, the limit on the
right-hand side of the last equation exists and is equal to $u_2(\zeta_2)$ if $\zeta_2$ is a point
of continuity of $u_2(\sigma_2)$(4).

The behavior of the function $f(\zeta)$ on the circle $\gamma_1$ can be studied in like
manner. The results of the foregoing investigation can be stated as follows(5):
Let the functions $u_1(\sigma_1)$ and $u_2(\sigma_2)$, satisfying (1) with $A = 0$, be bounded and
integrable with respect to $\varphi$ in the interval $(0, 2\pi)$. Then the analytic function
$f(\zeta)$ which is single-valued and regular in the annular region between the circles
$\gamma_1$ and $\gamma_2$ and whose real part takes on the values $u_1(\sigma_1)$ and $u_2(\sigma_2)$ at all
points of $\gamma_1$ and $\gamma_2$, respectively, at which these functions are continuous is
given by the formula in equation (10).

Formula (10) will be seen to form the basis for a very general method of 
treating a class of boundary value problems related to doubly connected 
regions. In the following sections this formula is applied to the problem of 
the torsion of hollow cylinders. 
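
The boundary value problem treated in this section can also be explored numerically. The sketch below does not implement formula (10); it uses a plain truncated Fourier (Laurent) expansion, matching each harmonic on the two circles, which is an assumption made here for illustration. The names `fourier_coeff`, `solve_annulus` and `sample_circle` are hypothetical, and the boundary data are assumed to have zero mean on both circles (condition (1) with A = 0).

```python
import cmath
import math


def fourier_coeff(samples, m):
    """Discrete estimate of (1/2pi) * integral of u(phi) * exp(-i*m*phi) d(phi)."""
    n = len(samples)
    return sum(u * cmath.exp(-2j * math.pi * m * k / n)
               for k, u in enumerate(samples)) / n


def solve_annulus(u1, u2, r1, r2, modes):
    """Return f(zeta) = sum_{m>=1} (c_m zeta^m + c_{-m} zeta^-m) whose real part
    matches the zero-mean samples u1 on |zeta| = r1 and u2 on |zeta| = r2."""
    terms = []
    for m in range(1, modes + 1):
        al1 = fourier_coeff(u1, m)
        al2 = fourier_coeff(u2, m)
        # Match the exp(i*m*phi) harmonic on both circles:
        #   c_m * r_j**m + conj(c_{-m}) * r_j**(-m) = 2 * alpha_m^(j),  j = 1, 2.
        c_pos = 2 * (al2 * r2**m - al1 * r1**m) / (r2**(2 * m) - r1**(2 * m))
        d = 2 * al1 * r1**m - c_pos * r1**(2 * m)   # d = conj(c_{-m})
        terms.append((m, c_pos, d.conjugate()))

    def f(zeta):
        return sum(cp * zeta**m + cn * zeta**(-m) for m, cp, cn in terms)

    return f


def sample_circle(func, radius, n):
    """Real part of func at n equally spaced points of the circle |zeta| = radius."""
    return [func(radius * cmath.exp(2j * math.pi * k / n)).real for k in range(n)]


if __name__ == "__main__":
    exact = lambda z: z**2 + (0.3 + 0.1j) / z      # test function, zero mean on circles
    f = solve_annulus(sample_circle(exact, 0.5, 64),
                      sample_circle(exact, 1.0, 64), 0.5, 1.0, modes=4)
    z0 = 0.7 * cmath.exp(0.4j)
    print(abs(f(z0) - exact(z0)))
```

For boundary values with many harmonics the factor 1/(r2^(2m) - r1^(2m)) is what the paper expands into the series in powers of (r1/r2)^2; the direct division used here is simply the shortest way to test a candidate solution.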

3. Statement of the torsion problem. In the theory of elasticity the St.
Venant's torsion problem for a region $D$ of the $xy$-plane may be formulated
as that of determining an analytic function $F(z)$ of the complex variable
$z = x + iy$ which is single-valued and regular within $D$ and such that at points
of its boundary $C$(6)


(13)    $2\mathrm{R}\{F(z)\} = x^2 + y^2 + \text{const.}$


Given the function $F(z)$, the important physical quantities for a twisted,
homogeneous, cylindrical beam whose cross section has the shape of the
region $D$ can be determined. For example, the well known formulas for the
shearing stresses $X_z$ and $Y_z$ at a point in any cross section can be written in
the form


$Y_z + iX_z = G\tau\,[\bar{z} - F'(z)],$


where $\tau$ is the "twist," and $G$ is the slide modulus of the material constituting
the beam. Its torsional rigidity $J$ is readily written in the form


$J = GI_0 - \frac{iG}{4}\int_C [F(z) - \overline{F(z)}]\,d(x^2 + y^2),$


where $I_0$ is the moment of inertia of the region $D$ with respect to the origin,
and the integral is taken over the complete boundary $C$. Making use of (13)
and the fact that $F(z)$ is single-valued, the latter can be written


$J = G\left\{I_0 + \frac{i}{2}\int_C \overline{F(z)}\,dF(z)\right\}.$


(4) Evans, loc. cit., pp. 39, 65.
(5) Compare with Villat [10].
(6) Cf. Love [2, p. 314].


In the present paper $D$ is taken as a doubly connected region bounded
internally and externally by the closed Jordan curves $C_1$ and $C_2$, respectively.
In view of the significance of equation (1) of the preceding section, it is
evident that the values of the constants in (13) on the curves $C_1$ and $C_2$ are not
independent; otherwise the function $F(z)$ determined by the boundary
condition is not necessarily single-valued in $D$. Apart from this restriction,
the constants are arbitrary.

Let the function $z = w(\zeta)$ map $D$ one-one and conformally on the interior
of an annular region bounded by two circles $|\zeta| = r_1$ and $|\zeta| = r_2$, $r_1 < r_2$, in the
plane of the complex variable $\zeta$. As before, these circles are denoted by $\gamma_1$ and
$\gamma_2$ and the values of $\zeta$ on them by $\sigma_1 = r_1e^{i\varphi}$ and $\sigma_2 = r_2e^{i\varphi}$, respectively. It is
understood, of course, that the radii of the circles $\gamma_1$ and $\gamma_2$ are not
independent, the ratio $r_1/r_2$ being determined uniquely by the region $D$(7). Moreover
the mapping is known to be topological on the boundary, so that the function
$w(\zeta)$ is continuous on $\gamma_1$ and $\gamma_2$.

Given the mapping function $z = w(\zeta)$, the torsion problem for the doubly
connected region $D$ can be transformed into a corresponding boundary value
problem for an annulus whose solution is given in the foregoing section. For,
if $f(\zeta)$ represents the values of $F(z)$ in the $\zeta$-plane, that is, $f(\zeta) = F[w(\zeta)]$, then
$f(\zeta)$ is single-valued and regular for $r_1 < |\zeta| < r_2$ and, according to (13), its
real part satisfies the conditions


(14)    $\mathrm{R}\{f(\sigma_j)\} = u_j(\sigma_j) = \frac{w(\sigma_j)\overline{w(\sigma_j)}}{2} + c_j \qquad (j = 1, 2),$

where the constants $c_1$, $c_2$ are taken so that equations (1) with $A = 0$ are
satisfied. Since $w(\zeta)$ is continuous on $\gamma_1$ and $\gamma_2$, the functions $u_1(\sigma_1)$, $u_2(\sigma_2)$ are
continuous on $\gamma_1$, $\gamma_2$, respectively. Thus the function $f(\zeta)$ satisfies the
conditions of the theorem stated at the close of the preceding section and,
therefore, is given by the formula (10).

The function $F(z)$ representing the solution of the torsion problem for
the region $D$ is obtained from $f(\zeta)$ by making the inverse of the transformation
$z = w(\zeta)$. On the other hand, this process of inversion is more or less
superfluous since the important physical quantities for the cylindrical beam with
cross section $D$ can easily be expressed in terms of the function $f(\zeta)$. Thus the
shearing stress at a point $z$ of the cross section is given by


$Y_z + iX_z = G\tau\left[\overline{w(\zeta)} - \frac{f'(\zeta)}{w'(\zeta)}\right],$


where $\zeta$ is the point of the annular region in the $\zeta$-plane corresponding to $z$
under the transformation $z = w(\zeta)$. Moreover the torsional rigidity $J$ is given
by


$J = G\left\{I_0 + \frac{i}{2}\int \overline{f(\zeta)}\,df(\zeta)\right\},$


where the integral is taken over the complete boundary of the annulus in the
positive sense.

(7) For particulars on the mapping of multiply connected regions, see C. Carathéodory,
Conformal representations, Cambridge Tracts in Mathematics and Mathematical Physics, no.
28, London, 1932, pp. 70-73.

4. Torsion of the eccentric ring. Let


(15)    $z = w(\zeta) = \frac{\zeta + a}{1 + a\zeta},$


where $a < 1$. Then the region $D$ of the $z$-plane corresponding to the annular
region $r < |\zeta| < 1$ of the $\zeta$-plane is that which is bounded internally and
externally by the circles $|z - c| = R$ and $|z| = 1$, respectively, where

$c = \frac{a(1 - r^2)}{1 - a^2r^2}, \qquad R = \frac{r(1 - a^2)}{1 - a^2r^2}.$


This mapping of the region $D$ upon the annulus is such that the points of the
circles $\gamma_2$ ($|\zeta| = 1$) and $\gamma_1$ ($|\zeta| = r$) correspond, respectively, to points of the
circles $|z| = 1$ and $|z - c| = R$.
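
The geometry of the mapping (15) is easy to confirm numerically. The following sketch assumes the reading of the formulas given in this section, namely z = (zeta + a)/(1 + a zeta), c = a(1 - r^2)/(1 - a^2 r^2) and R = r(1 - a^2)/(1 - a^2 r^2); the function names are hypothetical. It measures how far the images of the two circles of the annulus deviate from the circles |z - c| = R and |z| = 1.

```python
import cmath
import math


def w(zeta: complex, a: float) -> complex:
    """Mapping (15): z = (zeta + a) / (1 + a*zeta)."""
    return (zeta + a) / (1 + a * zeta)


def inner_circle(a: float, r: float):
    """Center c and radius R of the image of |zeta| = r."""
    denom = 1 - a * a * r * r
    return a * (1 - r * r) / denom, r * (1 - a * a) / denom


def max_deviation(a: float, r: float, n: int = 360):
    """Largest distance of sampled image points from the two target circles."""
    c, R = inner_circle(a, r)
    angles = [2 * math.pi * k / n for k in range(n)]
    dev_inner = max(abs(abs(w(r * cmath.exp(1j * t), a) - c) - R) for t in angles)
    dev_outer = max(abs(abs(w(cmath.exp(1j * t), a)) - 1) for t in angles)
    return dev_inner, dev_outer


if __name__ == "__main__":
    print(inner_circle(0.4, 0.5), max_deviation(0.4, 0.5))
```

Both deviations are at the level of rounding error, so the eccentricity c and the inner radius R of the ring can be prescribed by choosing a and r accordingly.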

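Since in this case $\sigma_2\bar{\sigma}_2 = 1$, it follows that $w(\sigma_2)\overline{w(\sigma_2)} = 1$. Also, by the theory
of residues, it is seen that


$\int_{\gamma_1} w(\sigma_1)\overline{w(\sigma_1)}\,\frac{d\sigma_1}{\sigma_1} = 2\pi i(1 - h),$


where


(16)    $h = \frac{(1 - a^2)(1 - r^2)}{1 - a^2r^2}.$

Accordingly, the functions $u_1(\sigma_1)$ and $u_2(\sigma_2)$ in the boundary condition (14)
are given by


(17)    $u_1(\sigma_1) = \tfrac{1}{2}\,[w(\sigma_1)\overline{w(\sigma_1)} - 1 + h], \qquad u_2(\sigma_2) = 0.$


It follows from the second of equations (17) that the first integral of each
term of the infinite series in (10) vanishes. Further, with due regard for the
inequalities in (12) in which $\rho_n = r^{2n}$, a simple evaluation of residues gives,
for $r < |\zeta| < 1$,
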

when «> 0, 
f 1 + apg 


: 


Therefore, in accordance with (10), the function $f(\zeta)$ which is regular for
$r < |\zeta| < 1$ and satisfies the condition (14) on the circles $\gamma_1$ and $\gamma_2$ can be
written in the form


(18) f(s) = — L(1/s)] + const., 


where, 


= 
(3) 
By a transformation(8) of the Lambert series $L(\zeta)$ in (18), the function $f(\zeta)$
can also be written in the form(9)


ahr? 


+ h[P(s/r) — P(r/$)] + const. (r <|¢| < 1), 
ar?+¢ 


(19) {) = 


where 


P(t) = >> 1)" 


n=l 1 


The latter form has the advantage over the former from the point of view 


of computation. 
5. Torsion of a hollow lune. Let 


_ (1 one 


(20) = = (l¢] <2), 


where that branch of the square root is chosen which has the value $+1$ when
$\zeta = 0$. Then the annulus $r < |\zeta| < 1$ of the $\zeta$-plane corresponds to the region $D$
of the $z$-plane which is bounded externally by arcs of equal radii intersecting
at right angles in the points $z = \pm 1$, and internally by the oval


The region D is an approximation of the cross section of a common type 
of hollow strut used in aircraft construction. By choosing values of r suffi- 
ciently near unity, the comparison can be extended to thin cylindrical shells 
whose sections are in the shape of the two intersecting circular arcs determin- 
ing the external boundary of D. 

If as in the preceding section $\sigma_1 = re^{i\varphi}$, it follows from (20) and the theory
of residues that, for $r > 0$,

(8) Cf. K. Knopp, Theory and application of infinite series, English translation by R. C.
Young, London, 1928, p. 452.

(9) The form of the solution given by (19) corresponds to that which was obtained by
Weinel with the aid of dipolar coordinates; see [11, p. 70]. The method employed in the present
paper is certainly more direct than that used by Weinel. Macdonald's [3] form of the solution
can be found in Love [2, p. 320].


1 
(21) do, = — Jt) + { 


1 G1—~ 


— (1 — when | <r, 
1 — when |u| 


where 


1—o)(1—¢ 


o1— 


these integrals are taken around the circle $\gamma_1$ ($|\zeta| = r$). The integral $J(u)$ can
evidently be written in the form


r— 
+ u*) — cos 


0 (r? 
where 
2r 


k= 
1+ 


Therefore(**), if 0<r <1, 
r? = 


J(u) = (14+ + (1479 


B 


where K, and E£;, are the complete elliptic integrals of the first and second 
kinds, respectively, a is the elliptic integral of the first kind defined as follows 
+ 1?) 
and Z(a, k;) is the Jacobi zeta-function; these elliptic integrals all have the 
modulus 
In particular, if 1 =0, equations (21) and (22) give 


1 w(o 1)@(o1) 


o1 


+ 
Z(a, ki) for 


(24) sn (a, k:) = 


1 
do, [2(1 + rl, 
ar? 


and consequently the constant $c_1$ in the definition of the function $u_1(\sigma_1)$ in
(14) becomes


(10) For particulars on the evaluation of the integral $J(u)$, see E. T. Whittaker and G. N.
Watson, Modern analysis, 4th edition, London, 1935, pp. 499, 518, 522.


1 
(25) — — 2(1 + 


Making use of an elementary quadratic transformation(11), the complete
elliptic integrals $K_1$ and $E_1$ with modulus $k_1 = 2r/(1 + r^2)$ can be expressed in
terms of the complete elliptic integrals $K$ and $E$ with modulus $r^2$ as follows:


$K_1 = (1 + r^2)K \quad\text{and}\quad E_1 = \frac{2E}{1 + r^2} - (1 - r^2)K.$
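
The quadratic (Landen) transformation quoted here can be verified numerically. The sketch below is an illustration only: it evaluates the complete elliptic integrals by the arithmetic-geometric mean, a standard technique that the paper does not use, and the function name `agm_K_E` is hypothetical. It assumes the relations K_1 = (1 + r^2)K and E_1 = 2E/(1 + r^2) - (1 - r^2)K with moduli k_1 = 2r/(1 + r^2) and r^2.

```python
import math


def agm_K_E(k: float, iterations: int = 30):
    """Complete elliptic integrals K(k), E(k) of modulus k via the AGM iteration."""
    a, b, c = 1.0, math.sqrt(1.0 - k * k), k
    series = 0.5 * c * c                      # n = 0 term of sum 2**(n-1) * c_n**2
    for n in range(1, iterations + 1):
        a, b, c = (a + b) / 2, math.sqrt(a * b), (a - b) / 2
        series += 2 ** (n - 1) * c * c
    K = math.pi / (2 * a)
    return K, K * (1.0 - series)


if __name__ == "__main__":
    r = 0.5
    K1, E1 = agm_K_E(2 * r / (1 + r * r))     # modulus k_1
    K, E = agm_K_E(r * r)                     # modulus r^2
    print(K1 - (1 + r * r) * K)
    print(E1 - (2 * E / (1 + r * r) - (1 - r * r) * K))
```

Both printed differences should be at the level of rounding error for any 0 < r < 1.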


Also by the same transformation, for |u| <r, 


‘ 
[za u*)(r* — 


r+ 


Z(a, ki) = 
») 


where 
sn (8, 72) = 


whereas for | z| >r 


Z(p’, r? 
[20 + 1?) 


Z(a, ki) = 
(a, Ri) i+? 


where 
sn r?) = 1/p. 


Consequently these equations, together with (14), (21), (23), and (25), give 


t41(01) 


do, 


(26) 


Z(8+K, r*) for | u| <r, 


Z(6’+K, r*) for | >r, 


1 
M = — [x + — r)K — 2E]. 


In view of the inequalities in (12) in which p, =r**, equations (26) give, 
for r<|¢| <1, 


(4) Cf. H. Hancock, Theory of elliptic functions, vol. 1, New York, 1910, p. 250. 


[January 
2 
where 


TORSION OF HOLLOW CYLINDERS 


1 


)(1— Pat )) 


Z(6.+K, 


when n21, 
1/2 


—M+ i- l—n Z 


when 
where 
sn (Bn, 77) = and sn (By, 1°) = 


If.og=e* and represents the circle = 1, it follows easily from (21) 
and on r=l, “that fow 


— dog=—+ log (—*)+ 
1— (1—1/?)*/2—1 for | >1. 


The constant -c; in the definition of the function. #2(¢2) in (14) is therefore 


Consequently these equations, together with the inequalities in (12), give, 
for.r< 
1 us(o2) 


2 1~ pt 1+ pif 
(28) 1-2 -a- Pat’) 
- Thus, by equations (10), (27), and (28},-the function f(¢) which is regular 


for r<|{| <1 and satisfies the condition (14) on the circles y: and 72 can, 
after proper rearrangement of terms, be written in the form 


for 


orn > 0. 


“+ 


(29) 


1943}. 11 

pf 

2 
+ [7.@) Ta(/s)] + const, 


12 R. C, F. BARTELS 


where 


1-r 
= —— (1 


1 as 1 + 1 1/2 


and 
sn (8,, 77) = 
Since $\mathrm{sn}(0, r^2) = 0$ and $Z(K, r^2) = 0$, it is at once evident that

$\lim_{r\to 0}\,[T_n(\zeta) - T_n(1/\zeta)] = 0$

for $n \ge 1$. Consequently as $r$ tends to zero the function $f(\zeta)$ defined in (29)
reduces to the solution of the torsion problem for the simply connected region
bounded by two circular arcs of equal radii and intersecting at right angles
at the points $z = \pm 1$.

(12) Cf. Sokolnikoff and Sokolnikoff [7, p. 386].

BIBLIOGRAPHY


1. A. G. Greenhill, Fluid motion between confocal elliptic cylinders and confocal ellipsoids,
Quart. J. Math. Oxford Ser. vol. 16 (1879) pp. 227-256.

2. A. E. H. Love, A treatise on the mathematical theory of elasticity, 4th edition, London,
1927, pp. 310-328.

3. H. M. Macdonald, On the torsional strength of a hollow shaft, Proc. Cambridge Philos. Soc.
vol. 8 (1893) pp. 62-68.

4. R. M. Morris, The internal problems of two dimensional potential theory, Math. Ann.
vol. 117 (1939) pp. 31-38; and Some general solutions of St. Venant's flexure and torsion problem.
I, Proc. London Math. Soc. (2) vol. 46 (1940) pp. 81-98.

5. N. Muschelišvili, Sur le problème de torsion des cylindres élastiques isotropes, Rendiconti
della Reale Accademia dei Lincei (6) vol. 9 (1929) pp. 295-300.

6. B. R. Seth, On the general solution of a class of physical problems, Philosophical Magazine
(7) vol. 20 (1935) pp. 632-640.

7. E. S. Sokolnikoff and I. S. Sokolnikoff, Torsion of regions bounded by circular arcs, Bull.
Amer. Math. Soc. vol. 44 (1938) pp. 384-387.

8. A. C. Stevenson, Flexure with shear and associated torsion in prisms of uni-axial and
asymmetric cross-sections, Transactions of the Royal Society of London (A) vol. 237 (1938)
pp. 161-229.

9. E. Trefftz, Über die Torsion prismatischer Stäbe von polygonalem Querschnitt, Math. Ann.
vol. 82 (1921) pp. 97-112.

10. H. Villat, Le problème de Dirichlet dans une aire annulaire, Rend. Circ. Mat. Palermo
vol. 33 (1912) pp. 134-173; p. 147.

11. E. Weinel, Das Torsionsproblem für den exzentrischen Kreisring, Ingenieur Archiv
vol. 3 (1932) pp. 67-75.

12. L. Prandtl, Zur Torsion von prismatischen Stäben, Physikalische Zeitschrift, vol. 4
(1903) pp. 758-770.

13. A. A. Griffith and G. I. Taylor, The use of soap films in solving torsion problems, Reports
and Memoranda, (n.s.), no. 333, June 1917. Technical Report of the Advisory Committee for
Aeronautics (British) for 1917-1918, vol. 3.

14. G. W. Trayer and H. W. March, The torsion of members having sections common in
aircraft construction, National Advisory Committee for Aeronautics, report no. 334, 1930.


THE UNIVERSITY OF MICHIGAN,
ANN ARBOR, MICH.


HEAT CONDUCTION IN AN INFINITE COMPOSITE SOLID
WITH AN INTERFACE RESISTANCE


BY 
W. A. MERSMAN 


1. Introduction. The purpose of this paper is to solve the problem of one- 
dimensional heat conduction in a body composed of two plane-boundary 
semi-infinite homogeneous solids of different materials in “imperfect contact” 
along their interface. (“Imperfect contact” is defined by equation (4) below.) 
The Laplace transformation method is used both in discovering and in rigorously
establishing the solution, by means of the inversion theorems of
Churchill and Doetsch(1). The question of uniqueness is not considered here.

2. The boundary value problem. Let $t$ denote time, $x$ the perpendicular
distance from the interface, $a_\nu$ and $k_\nu$ the thermal diffusivities and
conductivities, respectively, of the two materials, and $\lambda$ the interface resistance(2).
Throughout, $f(x)$ is a known function, integrable over any finite interval,
such that

$|f(x)| < \alpha\exp(\beta|x|),$


where $\alpha$ and $\beta$ are non-negative constants. If $U(x, t)$ is the temperature, we
have the following boundary value problem(3):


(1)    $\frac{\partial U}{\partial t} = a_\nu\frac{\partial^2 U}{\partial x^2}, \qquad t > 0,\ x \neq 0,$


(2)    $\lim_{t\to 0} U(x, t) = f(x), \qquad x \neq 0,$


(3)    $\lim_{x\to -0} k_1\frac{\partial U}{\partial x} = \lim_{x\to +0} k_2\frac{\partial U}{\partial x}, \qquad t > 0,$


(4)    $U(+0, t) - U(-0, t) = \lim_{x\to 0}\lambda k_\nu\frac{\partial U}{\partial x}, \qquad t > 0.$


The constants $a_\nu$, $k_\nu$, and $\lambda$ are all assumed to be greater than zero.


Presented to the Society, April 11, 1942; received by the editors November 22, 1941.

(1) Cf. Churchill, The solution of linear boundary-value problems in physics by means of the
Laplace transformation. Part I. A theory for establishing a solution in the form of an integral, for
problems with vanishing initial conditions, Math. Ann. vol. 114 (1937) pp. 591-613. This paper
will be designated by [C].

Cf. Doetsch, Theorie und Anwendung der Laplace Transformation, Berlin, 1937. This book
will be designated by [D].

(2) Cf. Riemann-Weber, Die Partiellen Differential-Gleichungen der Mathematischen Physik,
5th edition, 1912, vol. 2 pp. 85, 100.

(3) Throughout, $\nu = 1$ if $x < 0$, $\nu = 2$ if $x > 0$.



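Define $\Phi_\nu(t)$ as follows:

(5)    $\Phi_1(t) = U(-0, t), \qquad \Phi_2(t) = U(+0, t).$


Then the solution can be written by means of $\Phi_\nu$ in the well known form(4)


(6)    $U(x, t) = V_\nu(x, t) + \int_0^t \Phi_\nu(\tau)\,W(|x|\,a_\nu^{-1/2},\ t - \tau)\,d\tau,$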


where

(7)    $W(x, t) = x\exp(-x^2/4t)/2(\pi t^3)^{1/2},$


(8)    $V_1(x, t) = \frac{1}{2}\int_{-\infty}^{0} f(\xi)\{\Theta(x - \xi, a_1t) - \Theta(x + \xi, a_1t)\}\,d\xi,$


(9)    $\Theta(x, t) = (\pi t)^{-1/2}\exp(-x^2/4t),$


and $V_2$ is obtained from $V_1$ upon replacing $a_1$ by $a_2$ and integrating from $0$
to $\infty$.

Equations (3)-(6) now give two simultaneous integral equations for the
unknown functions $\Phi_\nu(t)$. These will be solved by the method of the Laplace
transformation.
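
The kernel W of (7) enters the transformed problem through the standard transform pair L{W(x, t)} = exp(-x s^{1/2}) for x > 0, which is what produces the exponential factor in the transformed solution. The sketch below checks this pair by direct quadrature for real s; the midpoint rule, the truncation of the integral at t = 60 and the function names are choices made here for illustration and are not part of the paper.

```python
import math


def W(x: float, t: float) -> float:
    """Kernel (7): x * exp(-x^2 / 4t) / (2 * sqrt(pi * t^3))."""
    return x * math.exp(-x * x / (4 * t)) / (2 * math.sqrt(math.pi * t**3))


def laplace_W(x: float, s: float, t_max: float = 60.0, steps: int = 200_000) -> float:
    """Midpoint-rule estimate of the integral of exp(-s*t) * W(x, t) over (0, t_max)."""
    h = t_max / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += math.exp(-s * t) * W(x, t)
    return total * h


if __name__ == "__main__":
    for x, s in ((1.0, 1.0), (0.5, 2.0)):
        print(laplace_W(x, s), math.exp(-x * math.sqrt(s)))
```

The two printed columns should agree to many digits, since the integrand and all its derivatives vanish at t = 0 and the tail beyond t = 60 is negligible for these values of s.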
3. The transformed problem and its solution. Throughout we use the
singly-infinite Laplace transformation:


$L\{U(x, t)\} = \int_0^\infty e^{-st}U(x, t)\,dt = u(x, s).$

For the sake of definiteness let $s$ be any complex number whose real part is
greater than $a_\nu\beta^2$. Denoting the transforms of $U$, $V$, and so on, by $u$, $v$, and
so on, respectively, the transforms of equations (3), (4), (6), are:


(3')    $\lim_{x\to -0} k_1\frac{\partial u(x, s)}{\partial x} = \lim_{x\to +0} k_2\frac{\partial u(x, s)}{\partial x},$


(4')    $\phi_2(s) - \phi_1(s) = \lim_{x\to 0}\lambda k_\nu\frac{\partial u(x, s)}{\partial x},$

(6')(5)    $u(x, s) = \phi_\nu(s)\exp[-|x|(s/a_\nu)^{1/2}] + v_\nu(x, s).$


Eliminating $\phi_\nu$ from these equations we obtain the transformed solution:


(10) u(x, s) = y,(x, 5) + w,(x, 5) 


(4) Cf. H. S. Carslaw, The mathematical theory of the conduction of heat in solids, 2d
edition, London, 1921, §§18, 23.
(5) The first term of the right member is obtained by the "Faltung" rule. Cf. [D, chap. 8].
For the specific transformations used, see the table in [D, Appendix 2].


0 
-exp si/2| — | 
f exp [-| + (s/o lath, «<0, 
(12) A = + B = 
w;(x, s) = f {exp [— | x — (s/a:)"?] 
+ exp [—| «+ &| (s/a:)"*]} dé, «<0, 


and $y_2$, $w_2$ are obtained from $y_1$, $w_1$, respectively, upon interchanging the
subscripts 1 and 2 and the integrals $\int_{-\infty}^{0}$ and $\int_0^{\infty}$.

4. The inversion problem. It is easily seen that the operations $\partial/\partial x$ and
$\lim_{x\to 0}$ are interchangeable with the integrations in (11) and (13), and hence
that (10) is an actual solution of the boundary value problem (3') and (4').
To show that the inverse Laplace transform of $u(x, s)$ exists and is a solution
of the original boundary value problem, we use the inversion theorems of
Doetsch and Churchill. For the inverse we use the notation:


(13) 


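$L^{-1}\{u(x, s)\} = \lim_{\omega\to\infty}\frac{1}{2\pi i}\int_{\gamma - i\omega}^{\gamma + i\omega} e^{ts}u(x, s)\,ds;$

that is, $L^{-1}$ always means this particular form of the inverse, whether or
not it is true that $u(x, s) = L\{L^{-1}[u(x, s)]\}$. Throughout, $\gamma > a_\nu\beta^2$, $\mathrm{R}(s^{1/2}) \ge 0$.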

Note that $y$ and $w$ are composed of two types of integrals, according as $x$
and $\xi$ have the same or opposite signs. It is sufficiently general to consider
the following:


(14) a(x, 8) exp [- + 


(15)    $g(x, s) = \int_0^\infty f(\xi)\exp[-|x - \xi|\,s^{1/2}]\,d\xi, \qquad x \ge 0.$


We begin with some inequalities. Noting that $|e^{\zeta}| = \exp[\mathrm{R}(\zeta)]$ and that
$\mathrm{R}(s^{1/2}) = [(\mathrm{R}(s) + |s|)/2]^{1/2}$, the following are easily obtained:


(16) | s*2(x, s)| s|*-*/? exp [— x(| s| /2)"], nz0. 


| "g(x, 5) 


S 5 n= 0, 1. 
x” 


(17) 


16 [January 
where 


HEAT CONDUCTION 


| exp [és «| | dt 
(18) & 
S 2*/%q| exp [ty — &:(| s| /2)"/2/4], 
y¥=R(s), 8 > 4220, n=0,1. 
By means of these inequalities the following lemmas can be proved: 


Lemma 1. If 0<xiSxSxs, OStSh, 1, then L-"{s*s(x, s)} and 
L-{d%(x, s)/dx*} exist and converge uniformly in x and t. 


This follows immediately from inequality (16). 
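
The two elementary facts on which the inequalities of this section rest, $|e^{\zeta}| = \exp[\mathrm{R}(\zeta)]$ and $\mathrm{R}(s^{1/2}) = [(\mathrm{R}(s) + |s|)/2]^{1/2}$ for the branch with non-negative real part, can be spot-checked as follows. This is only an illustrative sketch; the sample points and function names are arbitrary choices made here.

```python
import cmath
import math


def re_sqrt(s: complex) -> float:
    """Real part of the principal square root (the branch with R(s^(1/2)) >= 0)."""
    return cmath.sqrt(s).real


def re_sqrt_formula(s: complex) -> float:
    """The closed form [(R(s) + |s|) / 2]^(1/2) quoted in the text."""
    return math.sqrt((s.real + abs(s)) / 2)


if __name__ == "__main__":
    for s in (3 + 4j, -2 + 0.5j, 1 - 7j, -4 + 0j):
        print(s, re_sqrt(s), re_sqrt_formula(s))
    z = 0.3 - 2.0j
    print(abs(cmath.exp(z)), math.exp(z.real))
```

The closed form makes the decay rate of exp(-x s^{1/2}) along a vertical line explicit, which is how it is used in inequality (16).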
Lemma 2. If $0 \le x \le x_1$, $0 < t_1 \le t \le t_2$, then
$L^{-1}\{g(x, s)\}$
exists and converges uniformly in $x$ and $t$.


Proof. We wish to show that, given any $\epsilon > 0$, there exists an $\Omega(\epsilon)$,
independent of $x$ and $t$, such that, if $\omega_2 > \omega_1 > \Omega$, then

f e'*g(x, s)ds + e**g(x, s)ds| 


By equation (15) and inequality (18), the order of integration can be inverted. 
Hence, consider 


f f exp [ts —| & — x| s'/?]dsdt 
0 


C3) exp [ts —| & — s*]dsdé}. 
0 


Since the integrands are analytic in s, the integrations on s can be carried 
out over the contour C:+C2+C3+C,+Cs+(Cz, where the C’s are straight line 
segments with end points as follows: Ci:y—wst, —wat; C2:—wx, 
Cs: Cai wnt; Cy? wrt, wat; Co: wat, 

The contribution of C, is 


0 
f f — exp [tf — toot — | — x| — 
0 
which is less in absolute value than 


3a(2 l/2gty+8z, 
f exp [BE + ty — | — x| 
0 


2 


A similar proof shows that the total contribution of C,:+C:+(C.+C is less 
than 


which approaches zero as $\Omega$ approaches infinity, uniformly in $x$ and $t$.
Finally, the contributions of C,; and C; can be combined to give 


2 f “exp (=| 
-cos — (m/4) | & — x| 


Split the range of integration at §=£,>2x;. The contribution of the integral 
from £; to © is less in absolute value than ; 


2 f f exp [— 


On making the change of variable n= 8575? this is Sound to be less than 
& 


If 2>86? this integral converges, and hence can be made less than e by 
a suitable choice of &. Choose such a & and let it remain fixed. 
We have remaining the integral from 0 to &: 


& 
f exp [— | — «| (n/2)"?] 
0 @, 
-cos [tm — (x/4) —| — 
On making the substitution 
— | x| = ¢ 


and using the addition formula for the cosine, this is seen to be less in absolute 
value than 


& 
ar f exp [— — %)*/4t] cos [(m/4) + — 


fs 


plus the same expression with cos replaced by sin, where 
fn = (ton)? — | — /2(28)"2, 


By the second mean-value theorem this is equal to 


18 [January 
ee n = 1, 2. 


HEAT CONDUCTION 


& 
(19) | fs 


J, 


plus the same expression with cos replaced by sin, and {i =¢3S¢3.""- 
But the integrals sin {*df and cos converge ; hence there isa 
tive number M:such that ; 


cos <M> f° 


1 


cos 


sin 


for any values of {1, 3. Thus the expression (19) is less than 


On splitting the range of integration at $\xi=x$ this is seen to be less than
12(2)"*M

which approaches zero as $\omega$ approaches infinity, uniformly in x and t, q.e.d.
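The boundedness invoked in this proof can be checked numerically. The following is a minimal sketch, assuming the integrands are the Fresnel-type functions $\cos\zeta^2$ and $\sin\zeta^2$; the name `fresnel_partial` and the step count are illustrative and do not come from the paper. It evaluates the partial integrals by Simpson's rule and compares them with the known common limit $(\pi/8)^{1/2}$.

```python
import math

def fresnel_partial(func, upper, steps=40000):
    """Simpson's rule for the integral of func(z*z) over 0 <= z <= upper."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = upper / steps
    total = func(0.0) + func(upper * upper)
    for k in range(1, steps):
        z = k * h
        total += (4 if k % 2 else 2) * func(z * z)
    return total * h / 3.0

limit = math.sqrt(math.pi / 8.0)  # common value of both improper integrals
for upper in (5.0, 10.0, 20.0):
    c = fresnel_partial(math.cos, upper)
    s = fresnel_partial(math.sin, upper)
    # deviations from the limit shrink roughly like 1/(2*upper)
    print(upper, round(c - limit, 4), round(s - limit, 4))
```

Since the partial integrals stay within a fixed distance of their limit, any difference of two of them is bounded, which is the role played by the constant $M$.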


Lemma 3. [f Sx, St Ste, then con- 
verges uniformly in x and t. 


Proof. Using the method and notation of the proof of Lemma 2, it is
found that the contribution of $C_1+C_2+C_5+C_6$ is less in absolute value than


Similarly, the contribution of $C_3+C_4$ over the range from $\xi_1$ to $\infty$ is less than
exp [—
Choose a fixed $\xi_1$ such that this is less than $\epsilon$. Then we have remaining the
expression 
ar f | | exp-{—|¢— «| | 
0 $1 


By means of the last bracket this can be separated into two terms, the second
of which, as in the proof of Lemma 2, is less than


This leaves: 




& fs 
2a f exp «| {¢ +] — 2] 


-cos — — x)*/8t]¢dy | dé. 
Integrating by parts with respect to $\zeta$ gives


& 
att f et{exp [—| & — x| (w2/2)"/?] + exp [—| & — x| (w:/2)*/*] } 
& fe 
+ a(2¢8)-1/2 f xlexp[—|&—2| {¢ +|& — x| 


-sin [¢? — (€ — dé, 


which, as before, is less than 
+ 


which approaches zero as $\omega$ approaches infinity, uniformly in x and t, q.e.d.
The following can be proved by the same methods: 


Lemma 4. If $0 \le x \le x_1$, $n=0, 1$, then L-'{0"2(x, s)/dx"} exists
and converges uniformly in x and t.


Since | sg(x, s)| does not approach zero as |s| approaches infinity, the 
Laplace transformation method cannot be used to establish the existence(‘) 
of $\partial G/\partial t$ and $\partial^2 G/\partial x^2$. This will be done by classical methods by means of
the following: 


Lemma 5. g(x, s) =L{G(x, t)}, where 


and then $G(x, t)=L^{-1}\{g(x, s)\}$.
Proof. The substitution =x 2¢t'/? shows that 


Hence L{G} exists and converges absolutely. By (20) the order of in- 
tegration in L{G} can be inverted, giving(’) 


(*) Cf. [C, Theorem 6]. 
(*) Cf. [D, Appendix 2, table]. 




= — }dt = g(x, s). 


Hence, by [D, chap. 6, §5, Theorem 2], G is given by the inversion integral 
$L^{-1}\{g\}$, q.e.d.

Using these lemmas we are now in a position to establish the solution of 
the original boundary value problem. 

5. The solution established. 


THEOREM 1. The function $W_n(x, t)=L^{-1}\{w_n(x, s)\}$ exists and has the following properties:

(a) $W_1(x, t) = (1/2)\int_{-\infty}^{0} f(\xi)\{\Theta(x-\xi, a_1 t)+\Theta(x+\xi, a_1 t)\}\,d\xi$ and $W_2$ is obtained from $W_1$ upon replacing $a_1$ by $a_2$ and integrating from 0 to $+\infty$.

(b) $W_n(x, t)$ satisfies the differential equation (1).

(c) $W_n(x, t)$ satisfies the initial condition (2) at any point of continuity
of f(x).

(d) $\lim_{x\to 0} \partial W_n/\partial x = 0$.


Proof. Equation (a) follows from Lemma 5, and the remainder of the 
theorem is then easily proved by classical methods(°). 

To obtain the complete solution of the problem it is now sufficient to 
prove the following: 


THEOREM 2. The function $Y_n(x, t) = L^{-1}\{y_n(x, s)\}$ exists and has the following properties:
(a) $Y_n(x, t)$ satisfies the differential equation (1).
(b) $\lim_{t\to 0} Y_n(x, t) = 0$, $x \ne 0$.


oY; OF: 
(c) lim ki: — = lim —>» #>0. 
z——0 Ox Ox 
oY, 
(d) Y.(+0, t) + W.(+0, t) Y,(—0, t) W,(—0, = lim rk, 3 t>0. 
x 
Proof. Y, exists by Lemma 1. 
(a) By Lemma 1 and inequality (16), Y, satisfies the hypotheses of [C, 
Theorems 6 and 9]. Hence, if x0, ¢>0, But 
=[- {d%y,/dx*} , and, since (11) can be differentiated inside the integral sign, 


4 
= a;'sy,, q.e.d. 
ax? Yn 


(b) By Lemma 1 and inequality (16) the hypotheses of [C, Theorem 4] 
are satisfied, from which the result follows immediately. 


(®) Cf. Carslaw, loc. cit., §§16-18 for a proof under more stringent conditions on f(x). 


(c) and (d) From equations (11) and (13) it is easily seen that 


(22) yo(+0, s) + w2(+0, s) — 8) — wi(—0,'s) = rR, 


By Lemma 2 and inequality (17), $w_n(x, s)$ satisfies the hypotheses of [C,
Theorem 8]. Hence $W_n(\pm 0, t)=L^{-1}\{w_n(\pm 0, s)\}$, $t>0$.
By Lemma 4 and inequality (16), $y_n(x, s)$ and $\partial y_n/\partial x$ satisfy the hypotheses(*)
of [C, Theorems 8 and Hence’ t>0


and 
Ox 
Hence conditions (c) and (d) follow from (21) and (22), q.e.d. 
To summarize, the solution of the boundary value, ee (1)- @) is 


given by. 
(10’) U(x, 2) = W,(%#, + ¥,(x, 


where 
(13’) = f({O(x — at) + O(x'+ &, dé; 


$W_2(x, t)$ is obtained from $W_1(x, t)$ upon replacing $a_1$ by $a_2$ and integrating
from 0 to $+\infty$, $\Theta$ is given by (9), and


(23) 1) = tm — im 


if y >326?, where y,(x, s) is given by equations (11). 
6. Explicit forms of Y. In the preceding section the function $Y_n(x, t)$ was
obtained as a complex inversion integral. We now develop two more practical
formulas.

THEOREM 3. $Y_1(x, t)$ is given by the following formulas:


Y,(x, t) = f — + Bu, t)dé 
(24) 


(*) Actually, in [C] it is assumed that ay(zx, s)/ax =0( ls but the 
ay/ax = 0(|s|—*) given by our inequality (16), is sufficient, as can be seen from an examination 


of Churchill’s proof. 




and $Y_2$ is obtained from $Y_1$ upon interchanging the subscripts 1 and 2 and the
integrals $\int_{-\infty}^{0} \cdots d\xi$ and $\int_0^{\infty} \cdots d\xi$.


Proof. Note that in equations (11) each term is of the form 
Since f* exp +Bs")y]du, we can write 
ina) = f exp [— (2 + + 


By Lemma 1, $J(x, t) = L^{-1}\{j(x, s)\}$ exists. Hence, to prove the present theorem it is sufficient to show that the integration on s can be performed inside
the inner integral in (25). This will be done in two steps.

First we show that the integrations on $\mu$ and s can be inverted. It is sufficient to prove that(10)


(26) lim exp [— (x + + Bu)s"/?]dtdp 
0 0 


converges uniformly in s, and 


(27) tim f exp [— (x + + Bu)s!?]dtds 
0 


converges uniformly in $\mu$.
To prove (26) note that the absolute value of the integral from $\mu_1$ to $\infty$
is less than


(28) exp [- (x + Bu)(| s| 


by inequality (16), with $n=1/2$ and x replaced by $x+B\mu$. But (28) is less than
exp [— 2(| s| /2)¥*] exp [— {4 + B(| 5] ui] 
A + B(| s| 


which obviously approaches zero as $\mu_1$ approaches infinity, uniformly in s.
To prove (27), the absolute value of the double integral is less than 


f exp [— 2(| s| /2)*/2]ds 
y—ot 


by inequality (16). But this converges and is independent of $\mu$, q.e.d. Hence


(10) Cf. Pierpont, Theory of functions of real variables, 1905, vol. 1, p. 489, §680.3.




1 ytwi 
I(x, t) = lim — f exp [—(x + & + Bu)s"/? 
0 


0 we y—wi 


Now, by a proof similar to that of Lemma 5, or by inequalities (16) and
(18), the inversion of the integrations on $\xi$ and s can be justified. Therefore


J(x,t) = f om f exp [— (x +&+ Bu)s/?]} d&dy. 
0 0 
But the inner integral is well known(11). Hence,
J(x,t) = An e Bu, dp, q.e.d. 
= foe f + Bu, qe 
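The "well known" inversion used in this last step can be checked in the forward direction. The following is a minimal numerical sketch, assuming the standard table entry $L\{k(4\pi t^3)^{-1/2}\exp(-k^2/4t)\} = \exp(-k s^{1/2})$ for $k>0$ and real $s>0$, with $k$ standing in for $x+\xi+B\mu$; the names `kernel` and `laplace_transform` and the sample values of k and s are illustrative only.

```python
import math

def kernel(t, k):
    """k / (2 sqrt(pi t^3)) * exp(-k^2 / (4 t)), the assumed inverse transform."""
    return k / (2.0 * math.sqrt(math.pi * t ** 3)) * math.exp(-k * k / (4.0 * t))

def laplace_transform(func, s, upper=60.0, steps=60000):
    """Midpoint-rule approximation of the integral of exp(-s t) func(t) on (0, upper)."""
    h = upper / steps
    total = 0.0
    for j in range(steps):
        t = (j + 0.5) * h
        total += math.exp(-s * t) * func(t)
    return total * h

k, s = 1.5, 2.0  # sample values, chosen only for illustration
approx = laplace_transform(lambda t: kernel(t, k), s)
exact = math.exp(-k * math.sqrt(s))
print(approx, exact)
```

The truncation at `upper` is harmless here because the factor exp(-s t) makes the tail negligible; for complex s on the inversion line the same identity holds by analytic continuation, which this real-variable sketch does not attempt to verify.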


Another form of Y is given by the following 


THEOREM 4, 


Y,(x, 4) = (by/Bax'*) exp [A%(¢ 7)B-*] 


+ f S(HO(x + &, bar, 


and $Y_2(x, t)$ is obtained from $Y_1(x, t)$ on interchanging subscripts 1 and 2 and
the integrals $\int_{-\infty}^{0}$ and $\int_0^{\infty}$.

Proof. On observing that — 4 B-], 
and that $L^{-1}\{(s-A^2B^{-2})^{-1}\} = \exp[A^2B^{-2}t]$, the result follows immediately
from the "Faltung theorem"(12) since all the Laplace transformations concerned are absolutely convergent. (Compare the proof of Lemma 5 above.)
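The Faltung (convolution) step can be illustrated numerically. The sketch below assumes only the standard facts that $L^{-1}\{(s-a)^{-1}\} = \exp(at)$ and that a product of transforms corresponds to the convolution $\int_0^t f(t-\tau)g(\tau)\,d\tau$. Here a stands in for $A^2B^{-2}$, while the second factor (the constant function with transform 1/s) and the name `convolve` are hypothetical choices made only to have a closed form to compare with.

```python
import math

def convolve(f, g, t, steps=2000):
    """Trapezoidal approximation of the Faltung integral of f(t - tau) * g(tau) on [0, t]."""
    h = t / steps
    total = 0.5 * (f(t) * g(0.0) + f(0.0) * g(t))
    for j in range(1, steps):
        tau = j * h
        total += f(t - tau) * g(tau)
    return total * h

a = 0.5                                    # stands in for A^2 B^-2
exp_kernel = lambda t: math.exp(a * t)     # inverse transform of 1/(s - a)
unit = lambda t: 1.0                       # inverse transform of 1/s

t = 2.0
numeric = convolve(exp_kernel, unit, t)
closed_form = (math.exp(a * t) - 1.0) / a  # inverse transform of 1/(s (s - a))
print(numeric, closed_form)
```

In the theorem the second factor is the transform of a term containing f and $\Theta$ rather than a constant, but the mechanism is the same: the exponential kernel is convolved in t with the inverse transform of the remaining factor.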


UNIVERSITY OF CALIFORNIA, 
Davis, CALIF. 


(11) Cf. [D, Appendix 2, table].
(12) Cf. [D, chap. 8, §5, Theorem IV].


(29) 


CONGRUENCES IN UNITARY SPACE 


BY 
N. COBURN 


1. Introduction. In this paper, we shall study the properties of a congruence of $\infty^{n-1}$ curves which are imbedded in a unitary space of n dimensions $K_n$
(a real topological space of 2n dimensions). First, we consider the general
case, when the curves are $\infty^{n-1}$ unitary curves $K_1$ (real topological spaces of
two dimensions), and determine the associated congruence affinors. Then,
we determine the necessary and sufficient conditions in terms of congruence
vectors that the $\infty^{n-1}$ congruence curves should be either unitary $U_1$ (unitary
Euclidean curves)(1) or real curves $X_1$ (real topological spaces of one dimension). If the curves of the congruence are all real $X_1$, then we define the congruence to be real; if the curves are all unitary $U_1$, then we define the congruence to be complex Euclidean.

In the next section, we study two systems of Pfaffians which enable us
to define two types of orthogonality: (1) $\infty^1$ hypersurfaces which are completely unitary orthogonal to the congruence curves; (2) $\infty^1$ hypersurfaces
which are semi-unitary orthogonal to the congruence curves. It is shown that:
(1) the $\infty^1$ hypersurfaces which are completely unitary orthogonal to the
congruence curves admit of an intrinsic parameterization and are $\infty^1$ unitary
$K_{n-1}$; (2) if the $\infty^1$ hypersurfaces which are semi-unitary orthogonal to the
congruence curves admit of a parameterization, then they constitute $\infty^1$ semi-analytic(2) spaces $X_{n-1}$. A further analytical characterization of these two
types of surfaces is given.

The remainder of our work deals with two problems: (1) a characterization in terms of congruence affinors of those congruences which are either
completely unitary orthogonal or semi-unitary orthogonal to $\infty^1$ hypersurfaces in $K_n$; (2) special properties of these two types of congruences. Thus, in
connection with the second problem, it is shown that if the congruence is
either real, or complex Euclidean, analytic and completely unitary orthogonal
to $\infty^1$ hypersurfaces, then the conditions satisfied by the congruence vector
are similar to those satisfied by the congruence vector which is orthogonal
to $\infty^1$ hypersurfaces(3) in $V_n$ (n-dimensional Riemannian space). Again, if:
(1) the congruence is real and geodesic; (2) the $K_n$ has a symmetric connection, then every two hypersurfaces which are semi-unitary orthogonal to the


Presented to the Society, February 28, 1942; received by the editors February 3, 1942. 
(1) [5, vol. 2, p. 251].

(2) [3, equation (2.10)].

(3) [5, vol. 2, p. 28, equation 5.2].




congruence intercept equal arc segments on all $X_1$ of the congruence. This
latter result is similar to a theorem(4) in Riemannian space.

2. Notation(5). Consider a real space of 2n dimensions $X_{2n}$, whose coordinates are given by the real variables


(2.1) 

Into this X2,, we introduce the complex coordinates 

(2.2) $\xi^\lambda = x^\lambda + i y^\lambda$, $i = (-1)^{1/2}$,
(2.3) $\xi^{\lambda^*} = x^\lambda - i y^\lambda$.

Since the Jacobian of this transformation, $(-2i)^n$, does not vanish over $X_{2n}$,
the $\xi^\lambda$, $\xi^{\lambda^*}$ constitute a set of 2n independent variables which map the $X_{2n}$.
In view of the fact that the $\xi^{\lambda^*}$ are complex conjugate to the $\xi^\lambda$, we can determine
the points of $X_{2n}$ by assigning complex numbers to merely the $\xi^\lambda$. Hence, we say
that the $\xi^\lambda$ determine "points" which build a complex space of n dimensions $X_n$
(the above real topological $X_{2n}$).
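The non-vanishing of this Jacobian can be verified directly for small n. The following is a minimal sketch, assuming the variables are ordered as $(x^1,\ldots,x^n,y^1,\ldots,y^n)$ and the new coordinates as $(\xi^1,\ldots,\xi^n)$ followed by their conjugates; with that ordering the determinant is $(-2i)^n$, and a different ordering can change it only by a sign. The function names are illustrative.

```python
def determinant(matrix):
    """Determinant of a square complex matrix by Gaussian elimination with pivoting."""
    m = [row[:] for row in matrix]
    size = len(m)
    det = 1 + 0j
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) == 0:
            return 0j
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for row in range(col + 1, size):
            factor = m[row][col] / m[col][col]
            for c in range(col, size):
                m[row][c] -= factor * m[col][c]
    return det

def complex_coordinate_jacobian(n):
    """Jacobian matrix of (x^1..x^n, y^1..y^n) -> (xi^1..xi^n, conj xi^1..conj xi^n)."""
    rows = []
    for k in range(n):   # xi^k = x^k + i y^k
        rows.append([1 if c == k else (1j if c == n + k else 0) for c in range(2 * n)])
    for k in range(n):   # conj xi^k = x^k - i y^k
        rows.append([1 if c == k else (-1j if c == n + k else 0) for c in range(2 * n)])
    return rows

for n in (1, 2, 3):
    print(n, determinant(complex_coordinate_jacobian(n)), (-2j) ** n)
```

Because the determinant has modulus $2^n$ for every n, the change of variables is invertible at every point, which is all the argument above requires.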
Let us denote partial derivatives by 


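(2.4) $\partial_\lambda = \partial/\partial\xi^\lambda$, $\partial_{\lambda^*} = \partial/\partial\xi^{\lambda^*}$.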


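If $\psi(\xi^\lambda, \xi^{\lambda^*})$ is an analytic function of $\xi^\lambda$, $\xi^{\lambda^*}$, then we shall say that $\psi$ is semi-analytic; if $\phi(\xi^\lambda)$ is an analytic function of $\xi^\lambda$ (or $\xi^{\lambda^*}$) alone, then we shall
say that $\phi$ is analytic. In view of (2.4), we may express this last condition by
(2.5) $\partial_{\lambda^*}\phi = 0$.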
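Condition (2.5) can be illustrated for n = 1 by finite differences. The sketch below assumes the usual normalization $\partial/\partial\xi^{*} = (1/2)(\partial/\partial x + i\,\partial/\partial y)$, which may differ from the paper's convention by a constant factor that does not affect the vanishing condition; the function name `d_bar` and the two sample functions are hypothetical.

```python
def d_bar(func, z, h=1e-5):
    """Central-difference estimate of (1/2)(d/dx + i d/dy) func at the point z."""
    d_dx = (func(z + h) - func(z - h)) / (2 * h)
    d_dy = (func(z + 1j * h) - func(z - 1j * h)) / (2 * h)
    return 0.5 * (d_dx + 1j * d_dy)

z0 = 0.7 - 0.4j
analytic = lambda z: z ** 2 + 3 * z          # depends on xi alone
semi_analytic = lambda z: z * z.conjugate()  # depends on xi and its conjugate

print(abs(d_bar(analytic, z0)))        # close to zero: condition (2.5) holds
print(d_bar(semi_analytic, z0), z0)    # the derivative equals z0, so (2.5) fails
```

The first function is analytic in the sense of the text, while the second is only semi-analytic: it is a perfectly smooth function of x and y, yet its derivative with respect to the conjugate variable does not vanish.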
One further important formal idea must be noted—that of the conjugate 
function and equation. If we replace i by $-i$ in $\phi$, the resulting function is
denoted by $\phi^*$ (where $\phi$ is a scalar). From (2.2), (2.3), we see that $\xi^\lambda$ must
be replaced by $\xi^{\lambda^*}$ and vice versa. Hence $\phi(\xi^\lambda)$ becomes $\phi^*(\xi^{\lambda^*})$. In the case
of affinors, the conjugate affinor is obtained in the same manner. However, 
we shall indicate this conjugate by starring the previously unstarred indices 
and removing the star from the previously starred indices. Thus the conjugate 
of v,* is vy*,. Furthermore from our discussion, it follows that 
(2.6) $\partial_\lambda\phi^* = 0$.
The equation (2.6) is the so-called conjugate equation to (2.5). Also to every
affinor equation, there corresponds a conjugate equation obtained by replacing i by $-i$ and hence each affinor by its conjugate. The truth of this last
statement can be seen by decomposing each affinor into its real and imaginary
parts(6). In the following, we shall indicate the validity of the conjugate equation by the abbreviation "conj."

(4) [5, vol. 2, p. 31].


(5) Our notation is that of [5].
(*) Note, by composite differentiation, it follows that 0,=0/dx"—10/dy", d,°=9/dx" 




We specify that the group of this complex $X_n$ shall be the analytic group(7)
of coordinate transformations. Now, let us introduce a connection in $X_n$ by
means of the quantities which are functions of the coordinates 
We define the covariant differential of a contravariant vector o*(£*, —*) by 
(2.7) = dv + Trav dt’, conj. 


Likewise, we define the covariant differential of a covariant vector $w_\lambda(\xi^\kappa, \xi^{\kappa^*})$
by

(2.8) du, = dw — conj. 

By expanding the ordinary differential of a vector, we obtain 

(2.9) = + conj., 

(2.10) dw, = + conj: 

If we define the covariant derivative of v*, w, by the equations 

(2.11) = a0. + = — conj., 

(2.12) =30, = conj., 

then by use of the equations (2.9) through (2.12), the equations (2.7), (2.8)
become

(2.13) = + d#V,-0", conj., 

(2.14) dw, = dV,w + conj. 


An hermitian $X_n$ with covariant derivative defined by (2.11), (2.12) is denoted by $K_n$.
Let us introduce an hermitian tensor $a_{\lambda\mu^*}$ with hermitian symmetry, that is,


(2.15) Dy = = 


the sign (') indicating the transpose matrix. If we condition the $a_{\lambda\mu^*}$ by requiring that


then the space $K_n$ is said to be a unitary $K_n$. For such a space from (2.16),
we can show(8)

(2.17) = — = 0, 

(2.18) = — = 0. 


(7) The analytic group of transformations is given by $\xi^{\lambda'} = \xi^{\lambda'}(\xi^\lambda)$, conj.
(8) [5, vol. 2, p. 234].


The $a_{\lambda\mu^*}$ is now a fundamental tensor and can be used to raise and lower indices through the $\nabla$ operator. If we define the contravariant fundamental
tensor $a^{\lambda\mu^*}$ by


(2.19) = A>, conj., 


where $A^\kappa_\lambda$ is the unit affinor, then (2.17), (2.18) may be solved for the connection

(2.20) Ta = 

Finally, we introduce the torsion affinor 


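(2.22) $S_{\lambda\mu}{}^{\kappa} = (1/2)(\Gamma^\kappa_{\lambda\mu} - \Gamma^\kappa_{\mu\lambda}) = \Gamma^\kappa_{[\lambda\mu]}$, conj.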


The sign [ ] means that the antisymmetric product of the enclosed indices
is to be formed; the sign | | enclosing indices means that those indices are
to be excluded in forming the antisymmetric product. When the torsion
affinor can be written as


(2.23) = conj., 


the unitary space $K_n$ is said to have a semi-symmetric connection.
3. Congruences in unitary $K_n$. Consider a vector field $u^\lambda(\xi^\kappa, \xi^{\kappa^*})$ defined
over the unitary $K_n$. The system of differential equations in the parameter t,


(3.1) $d\xi^\lambda/u^\lambda = dt$, conj.,


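is said to define a congruence(9) of $\infty^{n-1}$ curves in the unitary $K_n$. We shall
study the decomposition of the affinors $\nabla_\alpha u_\lambda$, $\nabla_{\alpha^*} u_\lambda$. Consider affinors $l_{\alpha\lambda}$, $l_{\alpha^*\lambda}$
which we define as the projections of $\nabla_\alpha u_\lambda$, $\nabla_{\alpha^*} u_\lambda$, respectively, upon the local
$U_{n-1}$ which is unitary orthogonal(10) to $u_\lambda$. Hence, it follows that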

(3.2) u*l., = 0, u“la» = 0, conj., 

(3.3) tla, = 0, ula, = 0, conj. 

Furthermore, let $w_\alpha$, $z_\alpha$, $x_\alpha$, $y_\alpha$ be four arbitrary vectors in the above local
$U_{n-1}$, that is,


(3.4) = = = = 0, conj. 


We can now write(11)
(3 .5) Vat» = lay + + + conj., 


(9) [5, vol. 2, p. 27, equation 5.1].

(10) This local $U_{n-1}$ is determined by those vectors (subscript $j=1, \cdots, n-1$) which
are solutions of =0.

(11) [5, vol. 1, p. 19,




(3.6) Varttn = lan + hart + Vary + conj., 


where p, q are scalars.

If the parameter t in (3.1) is complex and the congruence curves are $U_1$,
or if the parameter t is real and hence the curves are $X_1$, then an analytic arc
length parameter s exists(12)


(3.7) $s = s(t)$, conj.


Now, if we replace the parameter t by s in the $\infty^{n-1}$ congruence curves, then
the associated congruence vector $u^\lambda$ (we indicate the vector by the same symbol as before) is a unit vector, that is,


(3.8) $u^\lambda u_\lambda = 1$, conj.


Because of (3.8), certain relations exist between the affinors in (3.5), (3.6). 
Before finding these relations, we formulate 


DEFINITION 1. (a) If the parameter t in (3.1) is real, then the congruence
defined by (3.1) will be said to be real. This congruence consists of $\infty^{n-1}$ $X_1$ in $K_n$;
(b) if the parameter t is complex but the $\infty^{n-1}$ curves of the congruence are $U_1$,
then we shall say that the congruence is complex Euclidean.


By covariant differentiation of (3.8), we obtain 
(3.9) (Vat) = — (Yau), conj. 
As a consequence of the equation 
(3.10) = ug, conj., 


we find that the right-hand side of (3.9) can be expressed in terms of Vatg-, 
that is, 


(3.11) (Vattr)u* = — conj. 

By use of (3.5), (3.6), the relation (3.11) can be shown to be equivalent to 
(3.12) $z_\alpha = -y_\alpha$, conj.,

(3.13) p = — q, conj. 


Conversely, if (3.12), (3.13) are valid, then the validity of (3.11), (3.9) fol- 
lows. But (3.9) may be written in the form 


(3.14) Va(%e*) = 0, conj. 
Hence, it follows that 
(3.15) = Conj., 


where c is some arbitrary constant in the unitary $K_n$. By use of (3.1), the
equation (3.15) becomes

(3.16) = c.

But this means that the curves of the congruence are $\infty^{n-1}$ $U_1$ (for complex t)(13) or $\infty^{n-1}$ $X_1$ (for real t). Hence, we have the theorem.

(12) [1, Theorems 3, 4].


THEOREM 1. The necessary and sufficient conditions that the solutions $u_\lambda$ of
(3.5), (3.6), when they exist, should define either a real congruence or a complex Euclidean congruence is that $z_\alpha = -y_\alpha$, $p = -q$.


4. Two systems of Pfaffians. Let us consider a general congruence vector
$u_\lambda(\xi^\kappa, \xi^{\kappa^*})$. In the first place, we associate with this vector a system of two
Pfaffians

(4.1) $u_\lambda\,d\xi^\lambda = 0$,
(4.2) $u_{\lambda^*}\,d\xi^{\lambda^*} = 0$.


Assuming that $u_1$, $u_{1^*}$ do not vanish over some region D of the unitary $K_n$,
we can rewrite the two previous equations in the form


(4.3) $d\xi^1 = -{\sum}'\,(u_a/u_1)\,d\xi^a$, $a = 2, \cdots, n$,
(4.4) $d\xi^{1^*} = -{\sum}'\,(u_{a^*}/u_{1^*})\,d\xi^{a^*}$,


where ${\sum}'$ denotes summation over all repeated indices with the exception of
the index 1. If the integrability conditions of this system are satisfied, we can
solve(13) for $\xi^1$, $\xi^{1^*}$,


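where: (1) $\xi_0^1$, $\xi_0^{1^*}$ (subindex 0) are arbitrary constants; (2) $\xi^a$, $\xi^{a^*}$ ($a=2, \cdots, n$)
are the independent variables; (3) $\xi^1$, $\xi^{1^*}$ are the dependent variables. By solving for $\xi_0^1$, $\xi_0^{1^*}$ (subindex 0), we obtain the two independent integrals of (4.1),
(4.2),

(4.7) $f(\xi^\lambda, \xi^{\lambda^*}) = \xi_0^1$,
(4.8) $g(\xi^\lambda, \xi^{\lambda^*}) = \xi_0^{1^*}$.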


We now prove 
LEMMA 1. The two independent integrals f, g are conjugate functions.


The equations (4.5), (4.6) become identities when the variables $\xi^\lambda$, $\xi^{\lambda^*}$
($\lambda=1, 2, \cdots, n$) are replaced by the arbitrarily assigned constants $\xi_0^\lambda$, $\xi_0^{\lambda^*}$
(subindex 0). Hence (4.7), (4.8) become identities when the same substitution is made. That is, the functions f and g reduce to conjugate quantities
$\xi_0^1$, $\xi_0^{1^*}$ (subindex 0) when the $\xi^\lambda$, $\xi^{\lambda^*}$ ($\lambda=1, 2, \cdots, n$) are assigned arbitrary
values. Thus, f and g are conjugate functions.

(13) [4, p. 49].

Since the quantity $\xi_0^{1^*}$ (subindex 0) is known when the quantity $\xi_0^1$ (subindex 0) has been assigned some arbitrary value, we shall say that (4.7), (4.8)
determine $\infty^1$ integrals of (4.1), (4.2). We prove


LEMMA 2. The $\infty^1$ integrals of (4.1), (4.2) determine $\infty^1$ unitary hypersurfaces in $K_n$.

Let us denote $\xi^a$, $\xi^{a^*}$ ($a=2, \cdots, n$) by $u^a$, $u^{a^*}$ ($a=1, \cdots, n-1$), respectively; these $u^a$, $u^{a^*}$ will serve as the hypersurface intrinsic parameters.
Thus, (4.7), (4.8) or (4.5), (4.6) can be written as
(4.9) $\xi^1 = \xi^1(u^a, u^{a^*})$, conj.

If the $u^{a^*}$ actually occur in the right-hand side of (4.9), then these equations
define $\infty^1$ semi-analytic(2) hypersurfaces $X_{n-1}$ in the unitary $K_n$. We shall
show that these $u^{a^*}$ do not occur. By forming the total differential of (4.9),
we obtain


(4.10) $d\xi^1 = du^a\,\partial_a\xi^1 + du^{a^*}\,\partial_{a^*}\xi^1$, conj.
Since the $du^a$, $du^{a^*}$ are equal to the differentials $d\xi^a$, $d\xi^{a^*}$ ($a=2, \cdots, n$) of
the independent variables $\xi^a$, $\xi^{a^*}$, we find by comparing (4.10) and its conjugate with (4.3), (4.4) that


(4.11) $\partial_{a^*}\xi^1 = 0$, conj.

Hence (4.9) may be written in the form

(4.12) $\xi^1 = \xi^1(u^a)$, conj.

The equations (4.12) determine $\infty^1$ analytic hypersurfaces $X_{n-1}$ in the unitary
$K_n$. Such analytic hypersurfaces are always unitary(14) $K_{n-1}$. Hence our
lemma is proved.
In view of the fact that (4.1), (4.2) are merely orthogonality relations, we
state


DEFINITION 2. The integrals f =const. and f* =const. of (4.1), (4.2) will be
said to define $\infty^1$ hypersurfaces in the unitary $K_n$ such that the hypersurfaces
are completely unitary orthogonal to the congruence vector $u_\lambda$.


(14) [5, vol. 2, p. 245].


We may restate Lemma 2 in terms of Definition 2 as follows:


LEMMA 3. The $\infty^1$ hypersurfaces which are completely unitary orthogonal to
the congruence vector $u_\lambda$ are $\infty^1$ unitary $K_{n-1}$ in the unitary $K_n$.
Let us now consider the single Pfaffian
(4.13) $u_\lambda\,d\xi^\lambda + u_{\lambda^*}\,d\xi^{\lambda^*} = 0$.


By assuming that $u_1$ does not vanish over some domain D of the unitary $K_n$,
we can construct a theory of this Pfaffian in which (4.3) is replaced by


The equation (4.5) becomes 

0 
Furthermore, (4.13) has only one independent integral 
(4.16) f(®, 2). 

0 

The equation corresponding to (4.4) is identical with (4.14); the equation 
corresponding to (4.6) is the conjugate of (4.15). Finally, the equation corre- 
sponding to (4.8) is equivalent to the conjugate of (4.16). However, this last 


equation is trivial since if f* =const. is an integral of (4.13), then $f= F(f^*)$.
We now define a new term. 


DEFINITION 3. The integral f=const. of (4.13) will be said to define $\infty^1$
hypersurfaces $X_{n-1}$ in the unitary $K_n$ which are semi-unitary orthogonal to the
congruence $u_\lambda$.


These semi-unitary orthogonal hypersurfaces can be characterized by
their parameter representation. We prove


LEMMA 4. The $\infty^1$ semi-unitary orthogonal hypersurfaces to the congruence $u_\lambda$
cannot possess an analytic parameter representation of rank (n-1). That is,
these $X_{n-1}$ are not unitary $K_{n-1}$.


Let us assume the contrary, namely, that these hypersurfaces possess an 
analytic parameter representation. 


(4.17) $\xi^\lambda = \xi^\lambda(u^a)$, $\lambda=1, \cdots, n$; $a=1, \cdots, n-1$, conj.


Since the rank of (4.17) is (n-1), we can solve for the (n-1) $u^a$ in terms of
(n-1) of the $\xi^\lambda$ (say, $\xi^a$, $a=2, \cdots, n$),


(4.18) $u^a = u^a(\xi^b)$, conj.
Substituting (4.18) into the first equation of (4.17), we obtain
(4.19) $\xi^1 = \xi^1(\xi^a)$, conj.
Forming the total differential of (4.19), we find


(4.20) $d\xi^1 = d\xi^a\,\partial_a\xi^1$, $a=2, \cdots, n$, conj.
Remembering that $\xi^a$, $\xi^{a^*}$ ($a=2, \cdots, n$) are independent variables and
comparing with (4.14), we obtain

(4.21) $u_{\lambda^*} = 0$, $\lambda=1, \cdots, n$, conj.
Hence the congruence vector vanishes. Thus, the assumption (4.17) is false
for a non-vanishing congruence; our lemma is proved. Our lemma implies
that if the class of semi-unitary orthogonal hypersurfaces can be parameterized, the parameterization is semi-analytic (see 4.9).

Another relation exists between the $\infty^1$ completely unitary orthogonal
hypersurfaces and the $\infty^1$ semi-unitary orthogonal hypersurfaces. Let us consider the systems of partial differential equations associated with (4.1), (4.2)
and (4.13). The system associated with (4.1), (4.2) is
(4.22) Oaf = 0, 

(4.23) = 0. 
The system associated with (4.13) is composed of (4.22), (4.23) plus the additional equation


If $\partial_1 f$, $\partial_{1^*} f$ do not vanish over the domain D in which $u_1$, $u_{1^*}$ do not vanish,
then the non-vanishing scalars $\rho$, $\psi$ exist such that (4.22), (4.23), and hence (4.1),
(4.2), are equivalent to


(4.25) $u_\lambda = \rho\nabla_\lambda f$,
(4.26) $u_{\lambda^*} = \psi\nabla_{\lambda^*} f$.


The equation (4.13) is equivalent to (4.25), (4.26) plus the additional equation (4.24). However, the latter implies


(4.27) $\psi = \rho$.


Before proceeding to enumerate these new results, we note that if f=const.
is a solution of (4.22) through (4.24), then f* =const. is a solution of the same
equations (see the discussion following 4.16). Hence the word "conjugate" can
be written after equations (4.25) through (4.27).


LEMMA 5. The solutions f =const. (and its conjugate) of (4.25), (4.26) where
$\rho \ne \psi$ determine $\infty^1$ completely unitary orthogonal hypersurfaces (unitary $K_{n-1}$)
to the congruence. The solution f =const. of (4.25), (4.26) where $\rho=\psi$ determines
$\infty^1$ semi-unitary orthogonal hypersurfaces $X_{n-1}$ to the congruence.


5. Congruences completely unitary orthogonal to $\infty^1$ $K_{n-1}$ in $K_n$. We consider congruences which are completely unitary orthogonal to $\infty^1$ unitary
$K_{n-1}$ in unitary $K_n$. The integrability conditions(15) of (4.1), (4.2) are
(S.1) (pla) + up) + = 0, Conj., 
(5.2) Ug0 — = 0, Con). 

By use of (2.11), (2.22), we find 

(5.3) Opa} = Viste) + Spa My, conj., 
(5.4) = Vary, CON}. 

Thus (5.1), (5.2) become 

(5.5) = — UpSpaj%y, CONj., 
(5.6) = 0, conj. 

By transvecting (5.2) or (5.6) with «*, we find 

(S.7) = Conj. 
Replacing the right-hand side by (3.6), we obtain 

(5.8) Varks = + CON). 
Hence, upon comparing (5.8) with (3.6), we find that 
(5.9) -lan = 0, x, = 0, conj., 


is a consequence of the integrability conditions (5.6). We now study the meaning of the integrability conditions (5.5). Let us assume that the connection
of $K_n$ is semi-symmetric (see 2.23). Then the equation (5.5) reduces to


(5.10) uplga} = 0, conj., 


in consequence of (3.5) and (5.5). By transvecting (5.10) with u* and using 
(3.2), (3.3), this equation becomes 


(5.11) liga) = 0, conj. 


Conversely, if (5.9) and (5.11) are valid and if the connection of $K_n$ is semi-symmetric, then the expressions (3.5), (3.6) satisfy the integrability conditions (5.5), (5.6). This leads us to


THEOREM 2. Consider a unitary space $K_n$ with semi-symmetric connection
and such that the solutions $u_\lambda$ of (3.5), (3.6) exist; then if and only if: (1) $l_{\alpha\lambda}$ is
symmetric; (2) $l_{\alpha^*\lambda}$, $x_\lambda$ vanish, does the vector $u_\lambda$ define a congruence which is
completely unitary orthogonal to $\infty^1$ hypersurfaces in the unitary $K_n$.

(5) [4, p. 29, equation 23]. Since the m, m* in (4.1), (4.2) are functions of # and &, the 
complete Pfaffians in (4.1), (4.2) can be written as wed?" +md?=0, =0, where 
w,, ws =0. If one writes out the equation 23, p. 29 of [4], then for unstarred variables (or
indices) the equation (5.1) results; if one of the starred variables (or indices) is used then equation (5.2) results.




Let us now restrict ourselves to real congruences (see Definition 1 (a)).
By introducing the Frenet formulas(16) for the $\infty^{n-1}$ curves $X_1$ of the congruence, we can determine the meaning of the vector $w_\lambda$ in (3.5). From the Frenet
formulas, it follows that
(5.12) + Var, = ky + km, conj., 

00 011 
where k (subindex 00, 01) are curvatures and ) (subindex 1) is the first nor- 
mal of each X, in the unitary K,. By use of (3.5), (3.6), (3.13), we find 


(5.13) + = Ww + m+ (p — p*)m, conj. 
By comparison of (5.12), (5.13), we obtain 


(5.14) + = km, conj., 
011 


(5.15) ? — p* = k, conj. 


If we require that the congruence shall be completely unitary orthogonal to
$\infty^1$ hypersurfaces in the unitary $K_n$, then it follows from Theorem 2 that the
vector $x_\lambda$ vanishes. Hence, we obtain the result


THEOREM 3. If the congruence is real and completely unitary orthogonal to $\infty^1$
hypersurfaces in the unitary $K_n$, then (1) the vector $w_\lambda$ lies along the first normal
to any $X_1$ of the congruence; (2) the magnitude of $w_\lambda$ is equal to the (0, 1) curvature of $X_1$; (3) the imaginary part of the scalar p is one-half the (0, 0) curvature
of $X_1$.

Again, let us consider the case where the congruence is completely unitary
orthogonal to $\infty^1$ hypersurfaces in the unitary $K_n$ and where the congruence
is either real or complex Euclidean. By means of Theorems 1 and 2, we may
write (3.5), (3.6) in the form


(5.16) = lan + + ham, conj., 
(5.17) Var, = — hart, COnj., 

where 

(5.18) ha = fa + Pitta, Conj. 


From (5.17), we see that if $\nabla_{\alpha^*}u_\lambda$ vanishes, then $h_\alpha$ vanishes. Hence from
(5.18), we find that the vector $z_\alpha$ and the scalar p vanish. By (5.15), this last
result means that the curvature k (subindex 00) of a real congruence vanishes.
Furthermore, the equations (5.16) and (5.11) furnish the result


(5.19) Vat, = + CON). 


(16) [2, equation 3.23].




The equation (5.19) is the condition(3) satisfied by a congruence of curves $V_1$
which are orthogonal to $\infty^1$ hypersurfaces in a Riemannian space of n dimensions $V_n$. Hence, we have the result


THEOREM 4. If the vector $u_\lambda$ determines a congruence which is: (1) completely
unitary orthogonal to $\infty^1$ hypersurfaces in the unitary $K_n$; (2) the vector $u_\lambda$ is
analytic; (3) the congruence is either real or complex Euclidean, then the conditions satisfied by the congruence vector $u_\lambda$ in the unitary $K_n$ are identical with
those satisfied by the congruence vector $u_\lambda$ which is orthogonal to $\infty^1$ hypersurfaces
in $V_n$. If the condition (3) is replaced by the stronger requirement that the congruence is real, then a further conclusion follows. Namely, the curvature k (subindex 00) vanishes.


6. Congruences semi-unitary orthogonal to $\infty^1$ $X_{n-1}$ in $K_n$. If the congruence is semi-unitary orthogonal to $\infty^1$ hypersurfaces in the unitary $K_n$, then
the integrability conditions of (4.13) must be satisfied. These are given by


(6.1) U0 Ug9 = 0, conj., 
(6.2) U0 + + = 0, conj. 


By use of (5.3), (5.4), the equations (6.1), (6.2) become 


(6.3) = — UpSpa] conj., 
(6.4) = — (1/3) ty, conj. 


Let us assume that the connection of $K_n$ is semi-symmetric (see 2.23). Then,
the right-hand side of (6.3) vanishes and the right-hand side of (6.4) becomes 
(—(1/3)ug-uaPr)). Upon substituting (3.5) into (6.3), we obtain 

(6.5) Uplga} = 0, conj. 

Transvecting with u* and using (3.2), (3.3), we conclude that 

(6.6) liga) = 0, conj. 


Conversely, if (6.6) is valid, then (3.5) satisfies (6.3). We next study the con- 
sequences of (6.4). By transvecting (6.4) with u*, u®, we obtain 


(6.7) + + = — CON)., 
(6.8) + + Vata) = — CON). 


Due to the symmetry of (6.4), no additional relations are obtained by further 
transvection with u*. With the aid of (3.5), (3.6), (6.6), the two previous equa- 
tions become 


+ + = — apr}, CON)., 
+ Vishal) + + 


(6.10) 
+ + = — MfaPrj, CON). 




Simplifying with the aid of (3.4), we find 

(6.11) — + 2 — yx — = — — m(Pau*)], conj., 

(6.12) — + 2 — — — Pr) + + Za — Ya — Wa — Pa) = 0, conj. 
Transvecting (6.11) with u® and simplifying with (3.2), (3.3), we obtain 
(6.13) — — WM = fPr— Conj. 

Substituting (6.13) into (6.11), we find 

(6.14) = 0, conj. 


By substituting (6.13) into (6.12), we find that the latter equation is identi- 
cally satisfied. Thus (6.13), (6.14) are the only equations obtained by trans- 
vecting (6.4) with «*. Conversely, by expanding (6.4) and using (3.5), (3.6), 
we find that in virtue of (6.6), (6.14), (6.13), the equation (6.4) is identically 
satisfied. This leads us to a theorem which is similar to Theorem 2.


THEOREM 5. Consider a unitary space $K_n$ with semi-symmetric connection
and such that the solutions $u_\lambda$ of (3.5), (3.6) exist; then if and only if: (1) lap, Lars
are symmetric; (2) x. does the vector $u_\lambda$ define a
congruence which is semi-unitary orthogonal to $\infty^1$ hypersurfaces in the unitary $K_n$.


By use of Lemma 5, we can obtain some further properties of the semi-unitary congruences. From (4.25), (4.26), (4.27) it follows that for congruences which are semi-unitary orthogonal to $\infty^1$ hypersurfaces in the unitary $K_n$


(6.15) u = pVrf, conj., 

(6.16) = conj. 

By covariant differentiation of (6.15), (6.16), we find 

(6.17) Vat, = pVaVaf + (Vaf)(Vap), Conj., 

(6.18) = pVaVarf + (Vaef)(Vap), conj. 

From the relations (2.11), (2.22), we see that

(6.19) VaVaf = VaVaf + Sra Vf, conj., 

(6.20) VaVaef = VarVaf, conj. 

Substituting the last two equations into (6.17), (6.18), the latter become 
(6.21) Vater = pVVaf + 2pSi2 Vif + (Vaf)(Vap), conj., 

(6.22) Vattre = + (Va-f)(Vap) conj., 

Simplifying (6.21), (6.22) by use of the equations (6.15) through (6.18), we 




obtain 

(6.23) Vat, = Vita + 2Sia My + 2p COnj., 

(6.24) = + 2p Conj. 

By use of (3.5), (3.6), (6.6), (6.14), the above two equations become 
(6.25) + Stat) = Sra My + Conj., 

(6.26) + Vtatr) = p Conj. 


Let us assume that the connection of $K_n$ is semi-symmetric. Transvecting the
previous two equations with u*, we obtain 


(6.27) m— a= — prt m(pau*) — p+ ln p), conj., 
(6.28) tre — = — ln p ln p), conj. 
We are now in a position to prove 


THEOREM 6. If: (1) the connection of the unitary $K_n$ is semi-symmetric;
(2) the congruence is semi-unitary orthogonal to $\infty^1$ hypersurfaces in $K_n$; (3) the
congruence is real or complex Euclidean; (4) $w_\lambda=z_\lambda$, $x_\lambda=y_\lambda$, then every two hypersurfaces $X_{n-1}$ intercept equal arc segments on all curves of the congruence.


From condition (4) of our theorem, that is,
(6.29) $w_\lambda = z_\lambda$, $x_\lambda = y_\lambda$, conj.,
it follows by use of (6.13) that
(6.30) $P_\lambda = (P_\alpha u^\alpha)u_\lambda$, conj.
Substituting (6.29), (6.30) into (6.27), (6.28), we find 
(6.31) = conj., 
(6.32) Uy* = Conj., 


where @ is some function of $\xi^\lambda$, $\xi^{\bar\lambda}$. Thus p is an integral of the system (4.22)
through (4.24). Since that system has only one independent integral, namely $f$,
it follows that


(6.33) p = F(f), conj., 
where F(f) is some arbitrary function of f. From (6.33), (6.15), (6.16), we find 
(6.34) F(f)df = ud? + 


for arbitrary $d\xi^\lambda$, $d\xi^{\bar\lambda}$. Now let us consider the vector $(d\xi^\lambda, d\xi^{\bar\lambda})$ as in the
direction of $u^\lambda$. By multiplying and dividing the right-hand side of (6.34) by
ds (the element of arc length along a curve of the congruence), we obtain


(6.35) F(f)df = 2ds. 


Integrating (6.35) between $f = c_0$, $f = c$, we find

(6.36)    $2s = \int_{c_0}^{c} F(f)\,df$.

The fact that the right-hand side of (6.36) is independent of any particular
curve of the congruence proves our theorem.
We can obtain the geometric meaning of the essential condition (4) in
Theorem 6 by limiting ourselves to real congruences. We prove

LEMMA 6. Consider a real congruence which is semi-unitary orthogonal to $\infty^1$
hypersurfaces in a unitary $K_n$ with semi-symmetric connection, then if and only
if: (1) the (01) curvature of each curve vanishes; (2) $p_\lambda = (p_\alpha u^\alpha) u_\lambda$, are the equations
(6.29) valid.

First, we show the sufficiency of our conditions. From the first condition, 
we have 


(6.37) k= 0, conj. 


Hence from (5.14), it follows that 

(6.38) wW + x = 0, conj. 

Since the congruence is real, the equation (3.12), is valid, that is, 
(6.39) 2 = — Conj. 

By use of the second condition and (6.13), we obtain 

(6.40) +.2.— — = conj. 


By substituting (6.38), (6.39) into (6.40), we obtain the equations (6.29). 
Conversely, from (6.29) and the fact that the congruence is real, it follows 
that the conditions (1), (2) of our theorem are satisfied. Thus, from (6.29) 


and (6.13), we obtain 
(6.41) Pr = (Paté*)u,, COnj. 


Furthermore, by use of (6.39), (6.29) and (5.14), we find that k (subindex 01) 
vanishes. 

We now translate Theorem 6 into terms connected with a real congruence 
of geodesic curves. Our result is 


THEOREM 7. If the curves $X_1$ of a real congruence in a unitary space $K_n$ with
semi-symmetric connection satisfy the conditions: (1) the congruence is
semi-unitary orthogonal to $\infty^1$ hypersurfaces in $K_n$; (2) the $\infty^{n-1}$ $X_1$ are geodesic;
and either (3) the curvature k (subindex 01) vanishes; or (4) $p_\lambda = (p_\alpha u^\alpha) u_\lambda$, then
every two hypersurfaces intercept equal arc segments on all $X_1$ of the congruence.




It has been shown(17) that an $X_1$ in a unitary $K_n$ with semi-symmetric
connection is geodesic if and only if


(6.42) Pr = — ku, conj., 
011 


(6.43) k = 0, conj. 
00 
The condition (6.43) is of no use to us. From (6.42), we note that if condition 
(3) of our theorem is valid, then the condition (4) is necessarily satisfied, and 
conversely. From Lemma 6, it follows that (6.29) is valid. Hence, Lemma 6 
leads to the desired conclusion. 
If, in particular, the space $K_n$ has a symmetric connection, then


(6.44)    $p_\lambda = 0$, conj.
Thus the conditions (3), (4) of Theorem 7 are satisfied. Theorem 7 becomes 
analogous to a theorem in Riemannian space(‘). 


REFERENCES 


1. N. Coburn, Unitary curves in unitary space, Revista de la Universidad Nacional de
Tucumán vol. 2 (1941) pp. 159-167.
2. N. Coburn, Frenet formulas for curves in unitary space, Journal of Mathematics and Physics,
Massachusetts Institute of Technology vol. 21 (1942) pp. 10-18.
3. N. Coburn, Semi-analytic unitary subspaces of unitary space, Amer. J. Math. vol. 64 (1942)
pp. 714-724.
4. T. Levi-Civita, The absolute differential calculus, London, 1929.
5. J. A. Schouten and D. J. Struik, Einführung in die neueren Methoden der
Differentialgeometrie, Groningen, vol. 1, 1935; vol. 2, 1938.


UNIVERSITY OF TEXAS,
AUSTIN, TEXAS


(17) [2, Theorem 4].




RECURSIVE PREDICATES AND QUANTIFIERS(1)


BY
S. C. KLEENE


This paper contains a general theorem on the quantification of recursive
predicates, with applications to the foundations of mathematics. The theorem
(Theorem II) is a slight extension of previous results on Herbrand-Gödel
general recursive functions(2), while the applications include theorems of
Church (Theorem VII)(3) and Gödel (Theorem VIII)(4) and other
incompleteness theorems. It is thought that in this treatment the relationship of
the results stands out more clearly than before.

The general theorem asserts that to each of an enumeration of predicate 
forms, there is a predicate not expressible in that form. The predicates con- 
sidered belong to elementary number theory. 

The possibility that this theorem may apply appears whenever it is pro- 
posed to find a necessary and sufficient condition of a certain kind for some 
given property of natural numbers; in other words, to find a predicate of a 
given kind equivalent to a given predicate. If the specifications on the predi- 
cate which is being sought amount to its having one of the forms listed in 
the theorem, then for some selection of the given property a necessary and 
sufficient condition of the desired kind cannot exist. 

In particular, it is recognized that to find a complete algorithmic theory 
for a predicate P(a) amounts to expressing the predicate as a recursive predi- 
cate. By one of the cases of the theorem, this is impossible for a certain P(a), 
which gives us Church’s theorem. 

Again, when we recognize that to give a complete formal deductive theory 
(symbolic logic) for a predicate P(a) amounts to finding an equivalent predi- 
cate of the form (Ex)R(a, x) where R(a, x) is recursive, we have immediately 
Gödel's theorem, as another case of the general theorem.

Still another application is made, when we consider the nature of a con- 
structive existence proof. It appears that there is a proposition provable clas- 
sically for which no constructive proof is possible (Theorem X). 

The endeavor has been made to include a fairly complete exposition of
definitions and results, including relevant portions of previous theory, so that
the paper should be self-contained, although some details of proof are omitted.


Presented to the Society, September 11, 1940; received by the editors February 13, 1942.
In the abstract of this paper, Bull. Amer. Math. Soc. abstract 46-11-464, erratum: line 4,
for “for.” read “for all.”.

(1) A part of the work reported in this paper was supported by the Institute for Advanced
Study and the Alumni Research Foundation of the University of Wisconsin.

(2) Gödel [2, §9] (see the bibliography at the end of the paper).

(3) Church [1].

(4) Gödel [1, Theorem VI].

The general theorem is obtained quickly in Part I from the properties of
the $\mu$-operator, or what essentially was called the p-function in the author's
dissertation(5). Part II contains some variations on the theme of Part I, and
may be omitted by the cursory reader. The applications to foundational
questions are in Part III, only a few passages of which depend on Part II.


I. THE GENERAL THEOREM ON RECURSIVE
PREDICATES AND QUANTIFIERS
1. Primitive recursive functions. The discussion belongs to the context of 
the informal theory of the natural numbers 


$0, 1, 2, \dots, x, x', \dots$.


The functions which concern us are number-theoretic functions, for which 
the arguments and values are natural numbers. 

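We consider the following schemata as operations for the definition of a
function $\varphi$ from given functions appearing in the right members of the
equations ($c$ is any constant natural number):


(I)    $\varphi(x) = x'$,
(II)   $\varphi(x_1, \dots, x_n) = c$,
(III)  $\varphi(x_1, \dots, x_n) = x_i$,
(IV)   $\varphi(x_1, \dots, x_n) = \theta(\chi_1(x_1, \dots, x_n), \dots, \chi_m(x_1, \dots, x_n))$,
(Va)   $\varphi(0) = c$,
       $\varphi(y') = \chi(y, \varphi(y))$,
(Vb)   $\varphi(0, x_1, \dots, x_n) = \psi(x_1, \dots, x_n)$,
       $\varphi(y', x_1, \dots, x_n) = \chi(y, \varphi(y, x_1, \dots, x_n), x_1, \dots, x_n)$.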

Schema (I) introduces the successor function, Schema (II) the constant 
functions, and Schema (III) the identity functions. Schema (IV) is the 
schema of definition by substitution, and Schema (V) the schema of primitive 
recursion. Together we may call them (and more generally, schemata re- 
ducible to a series of applications of them) the primitive recursive schemata. 
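
As an informal illustration of how the five schemata compose, the following Python sketch models each schema as a small combinator and builds addition and multiplication from them. The names `successor`, `constant`, `identity`, `substitute` and `primitive_recursion` are hypothetical and chosen only for this example; the loop inside `primitive_recursion` is merely one convenient way to unwind the recursion equations and is an assumption of the sketch, not part of the paper.

```python
# Hypothetical combinators mirroring Schemata (I)-(V); all names are illustrative.

def successor(x):
    # Schema (I): phi(x) = x'
    return x + 1

def constant(c):
    # Schema (II): phi(x1, ..., xn) = c
    return lambda *xs: c

def identity(i):
    # Schema (III): phi(x1, ..., xn) = xi (i counted from 1)
    return lambda *xs: xs[i - 1]

def substitute(theta, *chis):
    # Schema (IV): phi(xs) = theta(chi_1(xs), ..., chi_m(xs))
    return lambda *xs: theta(*(chi(*xs) for chi in chis))

def primitive_recursion(psi, chi):
    # Schema (V): phi(0, xs) = psi(xs); phi(y', xs) = chi(y, phi(y, xs), xs)
    def phi(y, *xs):
        value = psi(*xs)
        for k in range(y):          # unwind the recursion from 0 up to y
            value = chi(k, value, *xs)
        return value
    return phi

# add(y, x) = x + y:  add(0, x) = x;  add(y', x) = (add(y, x))'
add = primitive_recursion(identity(1), substitute(successor, identity(2)))

# mul(y, x) = x * y:  mul(0, x) = 0;  mul(y', x) = add(mul(y, x), x)
mul = primitive_recursion(constant(0), substitute(add, identity(2), identity(3)))
```

Here `add(3, 4)` evaluates to 7 and `mul(3, 4)` to 12; every function built this way is total, in keeping with the schemata.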

A function $\varphi$ which can be defined from given functions $\psi_1, \dots, \psi_k$ by
a series of applications of these schemata we call primitive recursive in the
given functions; and in particular, a function $\varphi$ definable ab initio by these
means, primitive recursive.

Now let us consider number-theoretic predicates, that is, propositional
functions of natural numbers.


(5) Kleene [1, §18].


In asserting propositions, and in designating predicates, we use a logical
symbolism, as follows. Operations of the propositional calculus: $\&$ (and),
$\lor$ (or), $\overline{\phantom{x}}$ (not), $\to$ (implies), $\equiv$ (equivalent). Quantifiers: $(x)$ (for all $x$),
$(Ex)$ (there exists an $x$ such that). These operations may be taken either in
the sense of classical mathematics, or in the sense of constructive or
intuitionistic mathematics, except where one or the other of the two interpretations
is specified.

A predicate $P(x_1, \dots, x_n)$ is said to be primitive recursive, if there is a
primitive recursive function $\pi(x_1, \dots, x_n)$ such that


(1)    $P(x_1, \dots, x_n) \equiv \pi(x_1, \dots, x_n) = 0$.


We can without loss of generality restrict $\pi$ to take only 0 and 1 as values,
and call it in this case the representing function of $P$.

Under classical interpretations, which give a dichotomy of propositions
into true and false, we can assign to any predicate $P$ a representing function $\pi$
which has 0 or 1 as value according as the value of $P$ is true or false; and then
say that $P$ is primitive recursive if $\pi$ is.
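
To make the convention concrete (value 0 for true, 1 for false), here is a small Python sketch of representing functions for the predicates "x = y" and "x < y". The helpers `monus` (truncated subtraction) and `sg` (signum) are standard primitive recursive functions, but they are written here directly in Python arithmetic rather than derived from the schemata; that shortcut, and all the names, are assumptions of this example.

```python
def monus(x, y):
    # Truncated subtraction: x - y when x >= y, otherwise 0.
    return x - y if x >= y else 0

def sg(x):
    # Signum: 0 for 0, and 1 for every positive number.
    return 0 if x == 0 else 1

def pi_eq(x, y):
    # Representing function of "x = y": value 0 exactly when x equals y.
    return sg(monus(x, y) + monus(y, x))

def pi_less(x, y):
    # Representing function of "x < y": value 0 exactly when x is less than y.
    return 1 - sg(monus(y, x))

def holds(pi, *xs):
    # Recover the predicate from its representing function: P(xs) iff pi(xs) = 0.
    return pi(*xs) == 0
```

With these definitions `holds(pi_less, 2, 5)` is true and `holds(pi_eq, 2, 5)` is false, and both representing functions take only the values 0 and 1.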

2. General recursive functions. We shall proceed to the Herbrand-Gödel
generalization of the notion of recursive function. We start with a preliminary
account, certain features of which we shall then restate carefully.

The way in which the function $\varphi$ is defined from the given functions in
an application of one of the primitive recursive schemata amounts to this:
the values $\varphi(x_1, \dots, x_n)$ of $\varphi$ for the various sets $x_1, \dots, x_n$ of arguments
are determined unambiguously by the equations and the values of the given
functions, using only principles of determination which we can formalize as
a substitution rule and a replacement rule.

The formalization presupposes suitable conventions governing the sym- 
bolism, which are easily supplied. In particular, we must distinguish between 
the variables for numbers and the numerals, that is the expressions for the 
fixed numbers in terms of the symbols for 0 and the successor operation ’. 
The rules are the following. 


R1: to substitute, for the variables $x_1, \dots, x_n$ of an equation, numerals
$\mathbf{x}_1, \dots, \mathbf{x}_n$, respectively.


R2: to replace a part $\mathbf{f}(\mathbf{x}_1, \dots, \mathbf{x}_n)$ of the right member of an equation by $\mathbf{x}$,
where $\mathbf{f}$ is a function symbol, where $\mathbf{x}_1, \dots, \mathbf{x}_n$, $\mathbf{x}$ are numerals, and where
$\mathbf{f}(\mathbf{x}_1, \dots, \mathbf{x}_n) = \mathbf{x}$ is a given equation.


By a given equation $\mathbf{f}(\mathbf{x}_1, \dots, \mathbf{x}_n) = \mathbf{x}$ for R2, we mean an equation
expressing one of the values of one of the given functions for the schema
application, or an equation of this form already derived by R1 and R2 from
the equations of the schema application.




Now let us consider any operation or schema, for the definition of a func- 
tion in terms of given functions, which can be expressed by a system of equa- 
tions determining the function values in this manner. In general the equations 
shall be allowed to contain, besides the principal function symbol which repre- 
sents the function defined, and the given function symbols which represent 
the given functions, also auxiliary function symbols. The given function sym- 
bols shall not appear in the left members of the equations. Such a schema we 
shall call general recursive. 

A function $\varphi$ which can be defined from given functions $\psi_1, \dots, \psi_k$ by a
series of applications of general recursive schemata we call general recursive
in the given functions; and in particular, a function $\varphi$ definable ab initio by
these means we call general recursive.

Suppose that a function $\varphi$ is defined, either from given functions
$\psi_1, \dots, \psi_k$ or ab initio, by a succession of general recursive operations.
Let us combine the successive systems of equations which effect the
definition into one system, using different symbols as principal and auxiliary
function symbols in each of the successive systems, and in the resulting system
considering as auxiliary all of the function symbols but that representing $\varphi$
and those representing $\psi_1, \dots, \psi_k$. The restriction imposed on a general
recursive schema that the given function symbols should not appear on the
left will prevent any ambiguity being introduced by the interaction under
R1 and R2 of equations in the combined system which were formerly in
separate systems. Thus the definition can be considered as effected in a single
general recursive operation.

In particular, any general recursive function can be defined ab initio in 
one operation, so that in the defining equations there are no given function 
symbols and what we have called the given equations for an application of R2 
must all be derivable from the defining equations by previous applications 
of R1 and R2. For the formal development, it is convenient to adopt the
convention that the principal function symbol shall be that one of the function
symbols occurring in the equations of the system which comes latest in a 
preassigned list of function symbols. The function is then completely de- 
scribed by giving the system of defining equations. 

We now restate the definition of general recursive function from this point 
of view. 

A function $\varphi(x_1, \dots, x_n)$ is GENERAL RECURSIVE, if there is a
system $E$ of equations which defines it recursively in the following sense. A
system $E$ of equations defines recursively a GENERAL RECURSIVE
function of $n$ variables if, for each set $x_1, \dots, x_n$ of natural numbers, an equation
of the form $\mathbf{f}(\mathbf{x}_1, \dots, \mathbf{x}_n) = \mathbf{x}$, where $\mathbf{f}$ is the principal function symbol of $E$,
and where $\mathbf{x}_1, \dots, \mathbf{x}_n$ are the numerals representing the natural numbers
$x_1, \dots, x_n$, is derivable from $E$ by R1 and R2 for EXACTLY one numeral $\mathbf{x}$.
The function of $n$ variables which is defined by $E$ in this case is the
function $\varphi$, of which the value $\varphi(x_1, \dots, x_n)$ for $x_1, \dots, x_n$ as arguments is THE
NATURAL NUMBER $x$ REPRESENTED BY THE NUMERAL $\mathbf{x}$.

A predicate $P(x_1, \dots, x_n)$ is general recursive, if there is a general
recursive function $\pi(x_1, \dots, x_n)$ taking only 0 and 1 as values such that (1) holds;
in this case, $\pi$ is called the representing function of $P$. (Or, if we introduce the
representing function $\pi$ first, $P$ is general recursive if $\pi$ is.)

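3. The $\mu$-operator. Consider the operator: $\mu y$ (the least $y$ such that). If
this operator is applied to a predicate $R(x_1, \dots, x_n, y)$ of the $n+1$ variables
$x_1, \dots, x_n, y$, and if this predicate satisfies the condition


(2)    $(x_1) \cdots (x_n)(Ey)R(x_1, \dots, x_n, y)$,


we obtain a function $\mu y R(x_1, \dots, x_n, y)$ of the remaining $n$ free variables
$x_1, \dots, x_n$.

Thence we have a new schema,


(VI$_1$)    $\varphi(x_1, \dots, x_n) = \mu y[\rho(x_1, \dots, x_n, y) = 0]$,


for the definition of a function $\varphi$ from a given function $\rho$ which satisfies the
condition


(3)    $(x_1) \cdots (x_n)(Ey)[\rho(x_1, \dots, x_n, y) = 0]$.


We now show that this schema, subject to the condition on $\rho$, is, like
(I)-(V), general recursive. For this purpose, we rewrite it in terms of
equations, using an auxiliary function symbol "$\sigma$":


(VI$_2$)    $\sigma(0, x_1, \dots, x_n, y) = y$,
            $\sigma(z', x_1, \dots, x_n, y) = \sigma(\rho(x_1, \dots, x_n, y'), x_1, \dots, x_n, y')$,
            $\varphi(x_1, \dots, x_n) = \sigma(\rho(x_1, \dots, x_n, 0), x_1, \dots, x_n, 0)$.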


Assuming the values of $\rho$, these equations will lead us to the values of $\varphi$ as
defined by (VI$_1$), and to only those values, as follows.

Consider informally any fixed set of values of $x_1, \dots, x_n$ (formally, this
means to substitute the corresponding set of numerals for the variables
"$x_1$", ..., "$x_n$"). We seek to obtain the corresponding value of $\varphi(x_1, \dots, x_n)$
by replacements on the third equation, and this is the only possibility we
have for obtaining that value under the two principles. First we can replace
$\rho(x_1, \dots, x_n, 0)$ by its value, and this is the only first replacement step
possible on that equation. According as that value is 0 or is not 0, we seek the
value of $\sigma$ for the next replacement step from the first or second of the
equations, and this is the only possible source for the next replacement value. In
the first case, we obtain 0 as that value; in the second, we use the value of
$\rho(x_1, \dots, x_n, 1)$ in the second equation, and then seek another value of $\sigma$.
We continue thus, with no choice in the procedure at any stage. The first case
is first encountered when we come to use the value of $\rho(x_1, \dots, x_n, y)$ for the
first $y$ for which that value is 0, and hence certainly for at most the $y$ given by
(3). When this happens, we can complete the pending replacements to obtain
that $y$ as the value of $\varphi(x_1, \dots, x_n)$. Thus we get the intended value; and
because we had no choice at any stage of the procedure, we can get no other
value.

The general recursiveness of the new schema is thus established. Hence, if
$R(x_1, \dots, x_n, y)$ is a general recursive predicate and (2) holds, by taking as $\rho$
the representing function of $R$, we can conclude that $\mu y R(x_1, \dots, x_n, y)$ is
a general recursive function.
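
The replacement procedure just described can be sketched in Python. The auxiliary function carries the current candidate y together with the last computed value of the given function; the search stops as soon as that value is 0 and otherwise moves on to the successor of y. The function name `mu_via_sigma` and the sample `rho` are hypothetical, and the `while` loop stands in for the chain of replacement steps; this is a sketch under those assumptions, not the formal derivation by R1 and R2.

```python
def mu_via_sigma(rho):
    """Return phi with phi(xs) = least y such that rho(xs, y) == 0.

    The search terminates only if such a y exists for the given xs,
    which is what the condition on rho is meant to guarantee.
    """
    def phi(*xs):
        y = 0
        z = rho(*xs, 0)          # start: phi(xs) = sigma(rho(xs, 0), xs, 0)
        while z != 0:            # nonzero z: sigma(z', xs, y) = sigma(rho(xs, y'), xs, y')
            y += 1
            z = rho(*xs, y)
        return y                 # zero reached: sigma(0, xs, y) = y
    return phi

def rho(x, y):
    # Sample given function: 0 exactly when (y + 1)^2 exceeds x.
    # For every x such a y exists, so the search always terminates.
    return 0 if (y + 1) * (y + 1) > x else 1

integer_sqrt = mu_via_sigma(rho)
```

For this choice of `rho`, `integer_sqrt(10)` evaluates to 3 and `integer_sqrt(16)` to 4, the least y whose successor squared exceeds the argument.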

What can we conclude if (2) is not assumed to hold? In this case,
$\mu y R(x_1, \dots, x_n, y)$ may not be completely defined as a function of the
variables $x_1, \dots, x_n$; but for any fixed set of values of $x_1, \dots, x_n$, the
sequence of steps by which we attempt to determine a value for $\varphi(x_1, \dots, x_n)$
from the equations remains as described for the preceding case, only with
now the matter of its termination in doubt. If $(Ey)R(x_1, \dots, x_n, y)$ does
hold for that set of values of $x_1, \dots, x_n$, then it does terminate as described,
with $\mu y R(x_1, \dots, x_n, y)$ as the value; while conversely, if it does
terminate, this can only be in consequence of a 0 being encountered among
the values of $\rho(x_1, \dots, x_n, y)$, so that $(Ey)R(x_1, \dots, x_n, y)$ does hold,
and $\mu y R(x_1, \dots, x_n, y)$ is the value.
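
When the existence condition is dropped, the search may run forever, so any executable illustration has to cut it off. The Python sketch below adds a step budget (a device of this example only, with the hypothetical name `mu_with_budget`): a returned number is the least y at which the given function takes the value 0, while `None` only records that the budget ran out and by itself says nothing about whether a suitable y exists further on.

```python
def mu_with_budget(rho, max_steps):
    """Least-y search that gives up after max_steps candidates.

    Returns the least y < max_steps with rho(xs, y) == 0, or None when
    no such y is met within the budget (the unbounded search might
    still terminate later, or never).
    """
    def phi(*xs):
        for y in range(max_steps):
            if rho(*xs, y) == 0:
                return y
        return None
    return phi

def rho_exact_root(x, y):
    # 0 exactly when y * y == x; for non-squares no y ever gives 0.
    return 0 if y * y == x else 1

exact_root = mu_with_budget(rho_exact_root, max_steps=50)
```

Here `exact_root(9)` returns 3, whereas `exact_root(10)` returns `None`: for the argument 10 the unbounded search would never terminate, which corresponds to the function being undefined there.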

Hence, in formal terms, if $F$ is the system of equations obtained by
adjoining, to any system $E$ which defines $\rho$ recursively, equations of the form
(VI$_2$), with the notation so arranged that "$\varphi$" becomes the principal function
symbol $\mathbf{f}$, then: an equation of the form $\mathbf{f}(\mathbf{x}_1, \dots, \mathbf{x}_n) = \mathbf{x}$, where $\mathbf{x}_1, \dots, \mathbf{x}_n$
are the numerals representing the natural numbers $x_1, \dots, x_n$, and where
$\mathbf{x}$ is a numeral, is derivable from $F$ by R1 and R2 if and only if
$(Ey)R(x_1, \dots, x_n, y)$.

4. The enumeration theorem. We introduce a metamathematical
predicate $\mathfrak{S}_n$ (for each particular $n$) as follows.


$\mathfrak{S}_n(Z, x_1, \dots, x_n, Y)$: $Z$ is a system of equations, and $Y$ is a formal
deduction from $Z$ by R1 and R2 of an equation of the form $\mathbf{f}(\mathbf{x}_1, \dots, \mathbf{x}_n) = \mathbf{x}$, where $\mathbf{f}$
is the principal function symbol of $Z$, where $\mathbf{x}_1, \dots, \mathbf{x}_n$ are the numerals
representing the natural numbers $x_1, \dots, x_n$, and where $\mathbf{x}$ is a numeral.


With this notation, we can state the last result of the preceding section
symbolically:


(4)    $(Ey)R(x_1, \dots, x_n, y) \equiv (EY)\mathfrak{S}_n(F, x_1, \dots, x_n, Y)$.


From a like exploration of the possibility that the sequence of steps does not
terminate, or simply from (4) by contraposition, we have also:


(5)    $(y)\overline{R}(x_1, \dots, x_n, y) \equiv (Y)\overline{\mathfrak{S}}_n(F, x_1, \dots, x_n, Y)$.


Using Gödel's idea of arithmetizing metamathematics(6), suppose that
natural numbers have been correlated to the formal objects, distinct numbers
to distinct objects. The metamathematical predicate $\mathfrak{S}_n(Z, x_1, \dots, x_n, Y)$ is
carried by the correlation into a number-theoretic predicate $S_n(z, x_1, \dots, x_n, y)$,
the definition of which we complete by taking it as false for values of $z$, $y$ not
both correlated to formal objects.

For a suitably chosen Gödel numbering, we can show, with a little trouble,
that $S_n$ is primitive recursive.

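Now (4) translates under the arithmetization into


(6a)    $(Ey)R(x_1, \dots, x_n, y) \equiv (Ey)S_n(f, x_1, \dots, x_n, y)$

with $f$ as the Gödel number of the system of equations $F$. The formula

(7a)    $(y)R(x_1, \dots, x_n, y) \equiv (y)\overline{S}_n(g, x_1, \dots, x_n, y)$

is obtained likewise from (5), after changing the notation so that $R$ is
interchanged with $\overline{R}$.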

In stating these results for reference, we shall go over from $S_n$ to a new
predicate $T_n$, which entails no present disadvantage and proves to be of
convenience in some further investigations(7). The predicate $T_n$ is defined from
$S_n$ as follows.


$T_n(z, x_1, \dots, x_n, y) \equiv S_n(z, x_1, \dots, x_n, y) \ \&\ (t)[t < y \to \overline{S}_n(z, x_1, \dots, x_n, t)]$.


By a theorem of Gödel(8), the primitive recursiveness of $T_n$ follows from that
of $S_n$. The formulas (6) and (7) in the theorem follow from (6a) and (7a) by
the definition of $T_n$ in terms of $S_n$.


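THEOREM I. Given a general recursive predicate $R(x_1, \dots, x_n, y)$, there are
numbers $f$ and $g$ such that


(6)    $(Ey)R(x_1, \dots, x_n, y) \equiv (Ey)T_n(f, x_1, \dots, x_n, y)$,
(7)    $(y)R(x_1, \dots, x_n, y) \equiv (y)\overline{T}_n(g, x_1, \dots, x_n, y)$.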


Now $(Ey)T_n(z, x_1, \dots, x_n, y)$ is a fixed predicate of the form
$(Ey)R(z, x_1, \dots, x_n, y)$ where $R$ is general recursive (in fact, as it happens,
primitive recursive). By the theorem, if we take successively $z = 0, 1, 2, \dots$,
we obtain an enumeration (with repetitions) of all predicates of the form
$(Ey)R(x_1, \dots, x_n, y)$ where $R$ is general recursive(9). Likewise, the theorem
gives us a fixed predicate of the form $(y)R(z, x_1, \dots, x_n, y)$ where $R$ is
general recursive which enumerates all predicates of the form $(y)R(x_1, \dots, x_n, y)$
where $R$ is general recursive. These enumerations form the basis for the
application of Cantor's diagonal method in the next section.


(6) Gödel [1].

(7) A revision, April 13, 1942.

(8) Gödel [1, IV].

(9) This result entered partly into the last theorem of Kleene [2], but the advantage of
using it at an earlier stage was overlooked. In anticipation, we may remark that XI-XVI of that
paper are essentially special cases of Theorem II below (with now a constructive proof for XVI).

5. The general theorem. By a familiar rule of classical logic, in each of
the following pairs of propositions (with a fixed $R$ for a given pair),


$(Ex)R(x)$              $(x)(Ey)R(x, y)$               $(Ex)(y)(Ez)R(x, y, z)$              $\cdots$
$(x)\overline{R}(x)$    $(Ex)(y)\overline{R}(x, y)$    $(x)(Ey)(z)\overline{R}(x, y, z)$    $\cdots$


either member is equivalent to the negation of the other. Hence we may assert
non-equivalence between the members of the pair. This argument is not good
in the intuitionistic logic. However, the non-equivalence for the case of one
quantifier,


(8)    $(Ex)R(x) \not\equiv (x)\overline{R}(x)$,


does hold good intuitionistically.

Consider the predicate form $(x)R(a, x)$ where $R$ is general recursive. This
gives a particular predicate of the variable $a$, whenever we specify the general
recursive predicate $R(a, x)$ of two variables. In particular, $(x)\overline{T}_1(a, a, x)$ is a
predicate of this form.

We shall show that this predicate is neither general recursive nor
expressible in the form $(Ex)R(a, x)$ where $R$ is general recursive.

For this purpose, suppose we have selected any particular general
recursive $R(a, x)$, giving a particular predicate of the latter form. By (6), there is
for this $R$ a number $f$ such that

(9)     $(Ex)R(a, x) \equiv (Ex)T_1(f, a, x)$.

Substituting the number $f$ for the variable $a$,

(10)    $(Ex)R(f, x) \equiv (Ex)T_1(f, f, x)$.

By (8),

(11)    $(Ex)T_1(f, f, x) \not\equiv (x)\overline{T}_1(f, f, x)$.

Combining (10) and (11),

(12)    $(Ex)R(f, x) \not\equiv (x)\overline{T}_1(f, f, x)$.

This refutes, for $a = f$, the equivalence of $(Ex)R(a, x)$ to $(x)\overline{T}_1(a, a, x)$. Since
this refutation can be effected, whatever general recursive $R$ we chose, for
some $f$ depending on the $R$, the predicate $(x)\overline{T}_1(a, a, x)$ is not expressible in
the form $(Ex)R(a, x)$ where $R$ is general recursive.
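
The shape of this diagonal argument can be seen in a toy setting. In the Python sketch below, `enum(z, a)` is a hypothetical, fully decidable enumeration of predicates of one variable (row z is "a is divisible by z + 1"); it merely stands in for the enumerating predicate of the text, which is not decidable. The diagonal predicate negates the value on the diagonal, so it disagrees with row f at the argument a = f, whatever f is.

```python
def diagonal_complement(enum):
    """Given enum(z, a) -> bool, return D with D(a) = not enum(a, a)."""
    return lambda a: not enum(a, a)

def enum(z, a):
    # Toy enumeration, an assumption of this example:
    # row z is the predicate "a is divisible by z + 1".
    return a % (z + 1) == 0

D = diagonal_complement(enum)

def refuting_argument(f):
    # The argument at which D and row f are guaranteed to differ is a = f.
    return f
```

For each row index f, `D(refuting_argument(f))` and `enum(f, refuting_argument(f))` have opposite truth values, so D coincides with no row of the enumeration.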

A fortiori, $(x)\overline{T}_1(a, a, x)$ is not expressible in the form $R(a)$ where $R$ is
general recursive. For were it so expressed, we should then have it in the form
$(Ex)R(a, x)$ where $R$ is general recursive, by taking as $R(a, x)$ the predicate


R(a) & x=x. 
This completes the proof of one case of the next theorem. 




For another case, consider the predicate form $(Ex)R(a, x)$ where $R$ is
general recursive. We can show similarly, using (7) instead of (6), that the
predicate $(Ex)T_1(a, a, x)$, which has this form, is neither general recursive nor
expressible in the form $(x)R(a, x)$ where $R$ is general recursive.

To illustrate the treatment of a case with more than one quantifier,
consider the predicate form $(x)(Ey)(z)R(a, x, y, z)$ where $R$ is general recursive.
The predicate $(x)(Ey)(z)\overline{T}_3(a, a, x, y, z)$ has this form. Select any particular
general recursive $R(a, x, y, z)$. By (6), for some $f$ depending on this $R$,

(13)    $(Ez)R(a, x, y, z) \equiv (Ez)T_3(f, a, x, y, z)$.

By corresponding quantifications of these equivalent predicates,

(14)    $(Ex)(y)(Ez)R(a, x, y, z) \equiv (Ex)(y)(Ez)T_3(f, a, x, y, z)$.


Classically, we can complete the argument as before, showing that
$(x)(Ey)(z)\overline{T}_3(a, a, x, y, z)$ is not expressible in any of the forms


$R(a)$    $(Ex)R(a, x)$    $(x)(Ey)R(a, x, y)$    $(Ex)(y)(Ez)R(a, x, y, z)$
          $(x)R(a, x)$     $(Ex)(y)R(a, x, y)$


where the $R$ for the form is general recursive.

To obtain an alternative phrasing of the theorem, in which it holds for all
cases intuitionistically, we may omit in the classical proof the step which
interchanges the two kinds of quantifiers under the operation of
negation. We thus show that the predicates $\overline{(Ex)T_1(a, a, x)}$, $\overline{(x)\overline{T}_1(a, a, x)}$,
$\overline{(Ex)(y)(Ez)T_3(a, a, x, y, z)}$, and so on, are neither expressible in the respective
forms $(Ex)R(a, x)$, $(x)R(a, x)$, $(Ex)(y)(Ez)R(a, x, y, z)$, and so on, where $R$
is general recursive, nor in any of the forms with fewer quantifiers.


THEOREM II. Classically, and for the one-quantifier forms intuitionistically:
To each of the forms


$R(a)$    $(Ex)R(a, x)$    $(x)(Ey)R(a, x, y)$    $(Ex)(y)(Ez)R(a, x, y, z)$    $\cdots$
          $(x)R(a, x)$     $(Ex)(y)R(a, x, y)$    $(x)(Ey)(z)R(a, x, y, z)$     $\cdots$

where the $R$ for each is general recursive, after the first, there is a predicate
expressible in that form but not in the other form with the same number of quantifiers
nor in any of the forms with fewer quantifiers.
Classically, and intuitionistically: To each of the forms, after the first, there
is a predicate expressible in the negation of that form but not in that form itself
nor in any of the forms with fewer quantifiers.


For simplicity, we have given the theorem for predicates of one variable a, 
but it holds: 

Likewise, replacing the variable $a$ throughout by $n$ variables $a_1, \dots, a_n$, for
any fixed positive integer $n$.




By an elementary predicate, we shall mean one which is expressible in terms
of general recursive predicates, the operations $\&$, $\lor$, $\overline{\phantom{x}}$, $\to$, $\equiv$ of the
propositional calculus, and quantifiers.

Suppose given an expression for a predicate in these terms. By the classical
predicate calculus, we can transform the expression so that all quantifiers
stand at the front. For each $m$, let $(x)_1, \dots, (x)_m$ be a set of $m$ primitive
recursive functions of $x$ which as a set ranges, with or without repetitions, over
all $m$-tuples of natural numbers, as $x$ ranges over all natural numbers (such
sets of functions are known). The equivalences


(15)    $(x_1) \cdots (x_m)A(x_1, \dots, x_m) \equiv (x)A((x)_1, \dots, (x)_m)$,
(16)    $(Ex_1) \cdots (Ex_m)A(x_1, \dots, x_m) \equiv (Ex)A((x)_1, \dots, (x)_m)$


enable us to eliminate consecutive occurrences of like quantifiers. These
transformations leave as operand of the prefixed quantifiers a general recursive
predicate of the free and bound variables. Hence, classically, the predicate
forms listed in the theorem for a given $n$ suffice for the expression of every
elementary predicate of $n$ variables.
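
One familiar choice of such functions for m = 2 comes from the Cantor pairing of the natural numbers. The Python sketch below is an illustration under that assumption (the text does not say which functions are intended): `unpair(x)` plays the role of the pair of component functions, and `exists_contracted` searches a single variable in place of two. The names are hypothetical, and `math.isqrt` requires Python 3.8 or later.

```python
from math import isqrt

def pair(a, b):
    # Cantor pairing: a bijection from pairs of naturals onto the naturals.
    return (a + b) * (a + b + 1) // 2 + b

def unpair(x):
    # Inverse of pair: returns the two components of x.
    w = (isqrt(8 * x + 1) - 1) // 2      # index of the diagonal containing x
    b = x - w * (w + 1) // 2             # position of x along that diagonal
    return w - b, b

def exists_contracted(A, bound):
    """Search x < bound for a witness of the single-quantifier form.

    Returns the least x < bound such that A holds of the components of x,
    or None if there is none below the bound.
    """
    for x in range(bound):
        if A(*unpair(x)):
            return x
    return None

# Sample two-place predicate: x1^2 + x2^2 = 25.
A = lambda x1, x2: x1 * x1 + x2 * x2 == 25
witness = exists_contracted(A, 1000)
```

Because `unpair` runs through every pair as x runs through the natural numbers, a pair satisfying A exists exactly when some single x does; here the search returns a witness whose components are 5 and 0.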

The theorem then says that no finite sublist of the forms would suffice. 

Classically, we are led to a classification of the elementary predicates ac- 
cording to the minimum numbers of quantifiers which would suffice for their 
expression in terms of general recursive predicates and quantifiers. 

The analogy between the logical operations of existential and universal
quantification and geometrical operations of projection and intersection,
respectively, is well known(10). The possibility of a connection between present
results and theories of Borel and Baire is suggested(11).


II. PRIMITIVE, GENERAL, AND PARTIAL RECURSIVE 
PREDICATES UNDER QUANTIFICATION 


6. Partial recursive functions. The author's definition of partial recursive
function extends the Herbrand-Gödel definition of general recursive function
to functions $\varphi$ of $n$ variables which need not be defined for all $n$-tuples of
natural numbers as arguments, retaining the characteristic of that definition
with respect to each $n$-tuple for which the function is defined(12). The partial
recursive functions include the general recursive functions as those which are
defined for all sets of arguments.

For a more complete description, take the definition of general recursive
function which is given at the end of §2, and replace the four capitalized
phrases by the following, respectively: PARTIAL RECURSIVE; PARTIAL
RECURSIVE; AT MOST; THE NATURAL NUMBER $x$ REPRESENTED
BY THE NUMERAL $\mathbf{x}$ IF THAT NUMERAL EXISTS, AND
IS OTHERWISE UNDEFINED.


(10) In particular, it has been discussed by Tarski.
(11) This suggestion was made to the author by Gödel and by Ulam.
(12) Kleene [4].

In dealing with functions which may not be completely defined, we
interpret the equation $\varphi(x_1, \dots, x_n) = \psi(x_1, \dots, x_n)$ as the assertion that $\varphi$ and $\psi$
have the same value for $x_1, \dots, x_n$ as arguments, taking it as undefined
(non-significant) if either value is undefined. We write $\varphi(x_1, \dots, x_n) \simeq \psi(x_1, \dots, x_n)$
to express the assertion that, if either of $\varphi$ and $\psi$ is defined for the arguments
$x_1, \dots, x_n$, the other is and the values are the same, and if either of $\varphi$ and $\psi$
is undefined for those arguments, the other is.
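
The difference between the two readings can be sketched in Python by modelling an undefined value as `None`. This modelling is an assumption of the example: in the theory, undefinedness means that no value is derivable, which in general cannot be tested. The names `weak_eq` and `complete_eq` are hypothetical.

```python
def weak_eq(u, v):
    """Ordinary equality of values: itself undefined (None) if either side is."""
    if u is None or v is None:
        return None
    return u == v

def complete_eq(u, v):
    """Complete equality: always a definite truth value.

    True when both sides are defined and equal, or both are undefined;
    False when exactly one side is undefined or the values differ.
    """
    if u is None or v is None:
        return u is None and v is None
    return u == v

def half(x):
    # Sample partial function: defined only for even arguments.
    return x // 2 if x % 2 == 0 else None
```

For instance `weak_eq(half(3), 1)` is `None` (no truth value), while `complete_eq(half(3), 1)` is `False` and `complete_eq(half(3), half(5))` is `True`.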

Similarly, in dealing with predicates which may not be completely defined,
$P(x_1, \dots, x_n) \equiv Q(x_1, \dots, x_n)$ expresses equivalence of value, and is
undefined if the value of either member is undefined; while
$P(x_1, \dots, x_n) \cong Q(x_1, \dots, x_n)$ expresses that the definition of either implies mutual
definition with equivalence, and the indefinition of either implies mutual
indefinition.

A predicate $P(x_1, \dots, x_n)$ not necessarily defined for all $n$-tuples of
natural numbers as arguments is partial recursive, if there is a partial recursive
function $\pi(x_1, \dots, x_n)$ taking only 0 and 1 as values such that


(17)    $P(x_1, \dots, x_n) \cong \pi(x_1, \dots, x_n) = 0$;


in this case, $\pi$ is called the representing function of $P$. (Or if we first introduce
a representing function $\pi$ of $P$, the value of which is to be 0, 1, or undefined
according as the value of $P$ is true, false, or undefined, then $P$ is partial
recursive if $\pi$ is.)

In §§2, 3, we remarked the general recursiveness of Schemata (I)-(VI)
with (VI) subjected to the condition (3); and we also considered Schema (VI)
for the case that $\rho$ is general recursive but (3) is not required to hold. The
method of those sections applies equally well without the restrictions; in
explanation of the schemata when the given functions may not be completely
defined or (3) not hold for (VI), it will suffice here to remark that the
conditions of definition for the functions introduced by the schemata may be
inferred a posteriori from the metamathematical results.


THEOREM III. The class of general recursive functions is closed under
applications of Schemata (I)-(VI) with (3) holding for applications of (VI).

The class of partial recursive functions is closed under applications of
Schemata (I)-(VI).


COROLLARY. Every function obtainable by applications of Schemata (I)-(VI)
with (3) holding for applications of (VI) is general recursive.

Every function obtainable by applications of Schemata (I)-(VI) is partial
recursive.


7. Normal form for recursive functions. We shall pursue a little further
the method of §4 to obtain the converse of this result. Besides the
metamathematical predicate $\mathfrak{S}_n$, we now require a metamathematical function as follows.


$\mathfrak{U}(Y)$: the natural number $x$ which the numeral $\mathbf{x}$ represents, in case $Y$ is a
formal deduction of an equation of the form $\mathbf{t} = \mathbf{x}$, where $\mathbf{x}$ is a numeral and $\mathbf{t}$ is
any term; and 0, otherwise.


According to the definition of general recursive function, if $\phi$ is a general
recursive function of $n$ variables, there is a system $E$ of equations such that


(18)  $(x_1) \cdots (x_n)(EY)\mathfrak{S}_n(E, x_1, \dots, x_n, Y)$,

(19)  $(x_1) \cdots (x_n)(Y)[\mathfrak{S}_n(E, x_1, \dots, x_n, Y) \rightarrow \mathfrak{U}(Y) = \phi(x_1, \dots, x_n)]$,

and the function $\phi(x_1, \dots, x_n)$ can be expressed in terms of $E$ thus

(20)  $\phi(x_1, \dots, x_n) = \mathfrak{U}(\mu Y \mathfrak{S}_n(E, x_1, \dots, x_n, Y))$,


if we understand the formal objects to be enumerated in some order, so that
the operator $\mu$ can be applied with respect to the metamathematical variable
$Y$; we may take the order to be that of the corresponding Gödel numbers.

If $\phi$ is a partial recursive function of $n$ variables, instead of asserting (18),
we can write


$(EY)\mathfrak{S}_n(E, x_1, \dots, x_n, Y)$


as the condition on $x_1, \dots, x_n$ that the function be defined for $x_1, \dots, x_n$
as arguments; we have (19), taking the implication to be true whenever the
first member is false, irrespective of the status of the second member; and
our convention calls for rewriting (20) thus,


(21)  $\phi(x_1, \dots, x_n) \simeq \mathfrak{U}(\mu Y \mathfrak{S}_n(E, x_1, \dots, x_n, Y))$,


in order that it be true (and not sometimes undefined) for all values of
$x_1, \dots, x_n$.

By the Gödel numbering already considered, the metamathematical function
$\mathfrak{U}(Y)$ is carried into a number-theoretic function $U(y)$, the definition of
which we complete by taking the value to be 0 for any $y$ not correlated to a
formal object. If the Gödel numbering was suitably chosen, $U$ as well as $S_n$
is primitive recursive.

Now (20), (18) and (19) in terms of $\mathfrak{S}_n$ and $\mathfrak{U}$ are carried into formulas
of like form in terms of $S_n$ and $U$. On passing over from $S_n$ to $T_n$, we then
have the (22), (23) and (24) of the theorem(13). The part of the theorem which
refers to a partial recursive function is obtained similarly.


THEOREM IV. Given a general recursive function $\phi(x_1, \dots, x_n)$, there is a


(13) Kleene [2, IV], with some changes in the formulation. The present $S_n$ corresponds to
the former $T_n$, using the Gödel numbering of proofs instead of the enumeration of provable
equations.


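number $e$ such that

(22)  $\phi(x_1, \dots, x_n) = U(\mu y T_n(e, x_1, \dots, x_n, y))$,

(23)  $(x_1) \cdots (x_n)(Ey)T_n(e, x_1, \dots, x_n, y)$,

(24)  $(x_1) \cdots (x_n)(y)[T_n(e, x_1, \dots, x_n, y) \rightarrow U(y) = \phi(x_1, \dots, x_n)]$.


Given a partial recursive function $\phi(x_1, \dots, x_n)$, there is a number $e$ such
that


(25)  $\phi(x_1, \dots, x_n) \simeq U(\mu y T_n(e, x_1, \dots, x_n, y))$,


where

$(Ey)T_n(e, x_1, \dots, x_n, y)$


is the condition of definition of the function, and (24) holds.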


Thus any general recursive function (any partial recursive function) is
expressible in the form $\psi(\mu y R(x_1, \dots, x_n, y))$ with (2) holding (in the form
$\psi(\mu y R(x_1, \dots, x_n, y))$) where $\psi$ and $R$ are primitive recursive. Hence:


COROLLARY. Every general recursive function is obtainable by applications
of Schemata (I)-(VI) with (3) holding for applications of (VI).

Every partial recursive function is obtainable by applications of Schemata
(I)-(VI).


Formula (25) contains the substance of the theorem. For it implies the
condition of definition of the function; and, in the case that $\phi(x_1, \dots, x_n)$
is defined for all sets of arguments, it gives (22) and (23). Moreover by the
definition of $T_n$ in terms of $S_n$, it implies (24).
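
As an informal illustration of the normal form (25), the following Python sketch replaces the Gödel numbering of formal deductions by a much cruder device; it is a toy under stated assumptions, not the construction of the text. Assumed here: a hypothetical finite table PROGRAMS of step-wise programs plays the role of the systems of equations, an index e into that table plays the role of a Gödel number, and y codes a pair (s, v) by Cantor pairing, so that the toy T(e, x, y) says that program e on argument x halts after exactly s steps with value v. Then T is decidable by running e for s steps, U(y) reads off v, and the function value is U applied to the least y satisfying T, the search failing to terminate exactly when the function is undefined.

```python
from itertools import count


def pair(s, v):
    """Cantor pairing of two natural numbers into one."""
    return (s + v) * (s + v + 1) // 2 + v


def unpair(y):
    """Inverse of pair: recover (s, v) from y."""
    w = 0
    while (w + 1) * (w + 2) // 2 <= y:
        w += 1
    v = y - w * (w + 1) // 2
    return w - v, v


def halve(x):
    """Toy program 0: x / 2 by repeated subtraction; never halts on odd x."""
    q = 0
    while x != 0:
        yield  # one computation step
        x -= 2
        q += 1
    return q


def successor(x):
    """Toy program 1: total, halts at once with x + 1."""
    return x + 1
    yield  # unreachable; only makes this function a generator


PROGRAMS = [halve, successor]  # hypothetical stand-in for Goedel numbers


def T(e, x, y):
    """Toy analogue of T_1: y codes (s, v), and program e on x halts
    after exactly s steps with value v. Always terminates."""
    if e >= len(PROGRAMS):
        return False
    s, v = unpair(y)
    run = PROGRAMS[e](x)
    for step in range(s + 1):
        try:
            next(run)
        except StopIteration as stop:
            return step == s and stop.value == v
    return False  # still running after s steps


def U(y):
    """Toy analogue of U: extract the value coded in y."""
    return unpair(y)[1]


def mu(pred):
    """Unbounded least-number operator; may fail to terminate."""
    for y in count():
        if pred(y):
            return y


def phi(e, x):
    """Normal form: U(mu y T(e, x, y))."""
    return U(mu(lambda y: T(e, x, y)))
```

Under these assumptions phi(0, 6) searches up to y = pair(3, 3) = 24 and returns 3, while phi(0, 7) would search forever, which is the situation the complete equality in (25) is designed to cover.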

We say that $e$ defines $\phi$ recursively, or $e$ is a Gödel number of $\phi$, if (25)
holds(14), in which case $e$ has all the properties in relation to $\phi$ which are
specified in the theorem.

It is here that the advantage of using $T_n$ instead of $S_n$ appears. A number $e$
which satisfies $\phi(x_1, \dots, x_n) \simeq U(\mu y S_n(e, x_1, \dots, x_n, y))$ (which is equivalent
to (25)) does not necessarily satisfy
$(x_1) \cdots (x_n)(y)[S_n(e, x_1, \dots, x_n, y) \rightarrow U(y) = \phi(x_1, \dots, x_n)]$.
While we could get around the difficulty by imposing the latter as an additional
condition on the Gödel numbers, it is more convenient simply to use $T_n$ instead
of $S_n$. (On the basis of Theorem III and the results which we had in terms of
$S_n$ before passing over to $T_n$, one can set up a primitive recursive function $V$
such that, if $e$ satisfies (25), then $V(e)$ has all the properties in terms of $S_n$.)

The numbers $f$ and $g$ for Theorem I can be described now as any numbers
which define recursively the partial recursive functions $\mu y R(x_1, \dots, x_n, y)$
and $\mu y \overline{R}(x_1, \dots, x_n, y)$, respectively.

(14) Kleene [2, Definition 2c, p. 738] and [4, top p. 153]. We have now also the changes in
the formulation of Theorem IV.


8. Consistency. Let us review the arguments used in proof of Theorems I
and III. For rigor, these have to be put in metamathematical form. Let $E$
be the system of equations associated with a series of applications of Schemata
(I)-(VI). We shall review only the case that no given function symbols occur
in $E$.

In general, we easily establish that, for each of certain sets $x_1, \dots, x_n$
of natural numbers, an equation of the form $f(x_1, \dots, x_n) = x$, as described
in the definitions of general and partial recursive function, is derivable from
$E$ by R1 and R2. In particular, if we are proving that $E$ defines a general
recursive function, we must show this for all $x_1, \dots, x_n$; if we have a prior
interpretation of the schemata applications as definition of a (partial or
complete) function $\phi(x_1, \dots, x_n)$, or require that $E$ define a $\phi(x_1, \dots, x_n)$
already known to us in some other manner, we must show this for all $x_1, \dots, x_n$
belonging to the range of definition of $\phi$, and also show that the $x$ in the
equation is the numeral representing the value of $\phi$ for $x_1, \dots, x_n$ as
arguments. This property of the equations $E$ and rules R1 and R2, the precise
formulation of which depends on the circumstances, we call the “completeness
property.” (When we wish merely to show that $E$ defines a partial recursive
function, the function to be determined a posteriori from $E$, no completeness
property is required.)

The second part of the discussion consists in showing that an equation
of the described form $f(x_1, \dots, x_n) = x$ is derivable from $E$ for at most one
numeral $x$; or if we have already established completeness in one of the above
senses, that the equations $f(x_1, \dots, x_n) = x$ referred to in the discussion of
completeness, for various $x_1, \dots, x_n$, are the only equations of that form
which are derivable from $E$ by R1 and R2. This we call the “consistency
property.”

As we indicated in §2, it suffices to handle each of the schemata in turn,
assuming equations for use with R2 which give the values of the given functions.
The argument for consistency which we sketched in §3 for Schema (VI)
applies as well to the other schemata. For Schema (IV) there is indeed a
choice in the order in which the values of the several $\chi$’s are introduced, but
it is without effect on the final result.

This very easy consistency proof was gained by restricting the replacement
rule so that replacement is only performable on the right member of an
equation, a part $f(x_1, \dots, x_n)$ where $f$ is a function symbol and $x_1, \dots, x_n$
are numerals being replaced by a numeral $x$. This eliminates the possibility
of deriving an equation of the form $g(y_1, \dots, y_m) = y$, where $g$ is a fixed
function symbol, $y_1, \dots, y_m$ are fixed numerals, and $y$ is any numeral, along
essentially different paths within the system, and therewith the possibility that
such an equation should be derivable for different $y$’s.

In some previous versions of the theories of general and partial recursive 
functions, the replacement rule was not thus restricted. The consistency proof 


which we gave in the version with the unrestricted replacement rule was based
on the notion of verifiability of an equation(15). This notion makes
presupposition of the values of the functions, and for the theory of partial recursive
functions also of the determinateness whether or not the values are defined.
In the latter case, it is not finitary. To give a constructive consistency proof
for the theory of partial recursive functions with the stronger replacement
rule seems to require the type of argument used in the Church-Rosser
consistency proof for λ-conversion(16), and in the Ackermann-von Neumann
consistency proof for a certain part of number theory in terms of the Hilbert
ε-symbol(17).

It is easily shown, by using the method of proof of Theorem IV to obtain 
the same normal form with the stronger replacement rule, that every function 
partial recursive under the stronger replacement rule is such under the weaker. 

Thus we find the curious fact that the main difficulty in showing the equiv- 
alence of the two notions of recursiveness comes in showing that the stronger 
rule suffices to define as many functions as the weaker. This is because the 
consistency of a stronger formalism is involved. The consistency of that for- 
malism is of interest on its own account, but is extraneous for the theory of 
recursive definition, including the applications corresponding to those of 
Church in terms of the λ-notation which presuppose the complicated
Church-Rosser consistency proof. All that is required for the theory of recursive
definition is some consistent formalism sufficient for the derivation of the 
equations giving the values of the functions. 

To this discussion we may add several supplementary remarks. We might 
in practice have a system E of equations and a method for deriving from E 
by R1 and the strong replacement rule, for all and only the $n$-tuples of a
certain set, an equation of the form $f(x_1, \dots, x_n) = x$ with a determinate $x$,
but lack the knowledge that unlimited use of the two rules could not lead to 
other such equations. In this situation, a function is defined intuitively for 
the n-tuples of the set, and undefined off the set. If we can characterize meta- 
mathematically our method of applying the two rules, we shall obtain a 
limited formalism known to be consistent, and the method used in establish- 
ing Theorem IV can then be applied to obtain equations defining the function 
recursively with the weak replacement rule. 

For some types of equations which define a function recursively with the 
strong replacement rule (consistency being known), a more direct method 
may be available for obtaining a system defining the function recursively 
with the weak replacement rule. For example, consider (in informal
language) the equation $\phi(\psi(x)) = \chi(x)$. To use this in deriving equations giving
values of $\phi$, we need to introduce values of $\psi$ by replacement on the left. After
(15) Kleene [2, p. 731] and [4, §2, the bracketed portion of the fifth paragraph].

(16) Church and Rosser [1].

(17) Hilbert and Bernays [1, §2, part 4, pp. 93-130, and Supplement II, pp. 396-400].


expressing the equation in the form
$\phi(y) = (\mu w[\psi((w)_1) = y \;\&\; \chi((w)_1) = (w)_2])_2$,
and separating the latter into a series of equations without the $\mu$-symbol by
the method which the theory of the schemata affords, replacement will be
required only on the right. This device is applicable to any equation of which
the left member has the form $f(g_1(x_1, \dots, x_n), \dots, g_m(x_1, \dots, x_n))$.
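
The device can be tried out numerically. The sketch below is only an illustration under assumptions not fixed by the passage: (w)_1 and (w)_2 are taken to be the exponents of the primes 2 and 3 in w, and psi(x) = 2x, chi(x) = x + 1 are hypothetical given functions. The function phi computed this way satisfies phi(psi(x)) = chi(x), and its values are found by a search over w in which psi and chi are only ever evaluated, never inverted.

```python
from itertools import count


def exponent(w, p):
    """Exponent of the prime p in w (taken as 0 when w == 0)."""
    if w == 0:
        return 0
    k = 0
    while w % p == 0:
        w //= p
        k += 1
    return k


def comp1(w):
    """(w)_1, assuming the first prime is 2."""
    return exponent(w, 2)


def comp2(w):
    """(w)_2, assuming the second prime is 3."""
    return exponent(w, 3)


def psi(x):
    """Hypothetical given function."""
    return 2 * x


def chi(x):
    """Hypothetical given function."""
    return x + 1


def phi(y):
    """(mu w [psi((w)_1) = y & chi((w)_1) = (w)_2])_2; may not terminate."""
    for w in count():
        if psi(comp1(w)) == y and chi(comp1(w)) == comp2(w):
            return comp2(w)
```

For an odd argument the search does not terminate, so phi is undefined there, as is to be expected of a function that is in general only partial recursive.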

The precise form of the restriction which is used to weaken the replace- 
ment rule is somewhat arbitrary, so long as it accomplishes its purpose of 
channelling the deductions of equations giving the values of the functions. 
The restriction as it was stated in the early Gödel version is now simplified,
since we need to consider only equations having the forms appearing in the
six schemata. Gödel provided for equations the left members of which could
have the form $f(g_1(x_1, \dots, x_n), \dots, g_m(x_1, \dots, x_n))$ where $f$ is the
principal function symbol and $g_1, \dots, g_m$ are given function symbols, and
therefore allowed replacement on the left in the case of the $g$’s.

9. Predicates expressible in both one-quantifier forms. By Theorem IV,
for any general recursive predicate $P(x_1, \dots, x_n)$,


(26)  $P(x_1, \dots, x_n) \equiv (Ey)[T_n(e, x_1, \dots, x_n, y) \;\&\; U(y) = 0]$,
(27)  $P(x_1, \dots, x_n) \equiv (y)[T_n(e, x_1, \dots, x_n, y) \rightarrow U(y) = 0]$,


where $e$ is any Gödel number of the representing function of $P$.

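Conversely, suppose that for a predicate $P$ both
$P(x_1, \dots, x_n) \equiv (Ey)R(x_1, \dots, x_n, y)$ and
$P(x_1, \dots, x_n) \equiv (y)S(x_1, \dots, x_n, y)$ where $R$
and $S$ are general recursive. From the second of these equivalences, under
classical interpretations, $\overline{P}(x_1, \dots, x_n) \equiv (Ey)\overline{S}(x_1, \dots, x_n, y)$. By the
classical law of the excluded middle,
$(Ey)[R(x_1, \dots, x_n, y) \lor \overline{S}(x_1, \dots, x_n, y)]$.
Therefore

(28)  $P(x_1, \dots, x_n) \equiv R(x_1, \dots, x_n, \mu y[R(x_1, \dots, x_n, y) \lor \overline{S}(x_1, \dots, x_n, y)])$,

where the second member is general recursive by Theorem III.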
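
The argument just given amounts to a decision procedure: search for the least y at which either R holds or S fails (one of the two must happen), then test R at that y. A minimal Python sketch, with a hypothetical example chosen by us: P(x) is "x is a perfect square", R(x, y) is "y*y = x", and S(x, y) is "x does not lie strictly between y*y and (y+1)*(y+1)", so that P is expressible both as (Ey)R and as (y)S.

```python
from itertools import count


def R(x, y):
    """Hypothetical recursive predicate: y is a square root of x."""
    return y * y == x


def S(x, y):
    """Hypothetical recursive predicate: x is not strictly between
    the consecutive squares y^2 and (y+1)^2."""
    return not (y * y < x < (y + 1) * (y + 1))


def P(x):
    """Decide P from its two one-quantifier forms: take the least y
    with R(x, y) or not S(x, y), and return R(x, y) there."""
    for y in count():
        if R(x, y) or not S(x, y):
            return R(x, y)
```

The search always terminates here because every x is either a square (a witness for R exists) or lies strictly between two consecutive squares (a counterexample to S exists); this is the role played by the law of the excluded middle in the text.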


THEOREM V. Every general recursive predicate $P(x_1, \dots, x_n)$ is expressible
in both of the forms $(Ey)R(x_1, \dots, x_n, y)$ and $(y)R(x_1, \dots, x_n, y)$ where the $R$
for each is primitive recursive. Under classical interpretations, conversely, every
predicate expressible in both of these forms where the $R$ for each is general
recursive is general recursive.

Now consider any predicate expressible in one of the forms of Theorem II 
after the first. According as the innermost quantifier in this form is existential 
or universal, we can apply (26) or (27), and then absorb the extra quantifier 
by (15) or (16), respectively, to obtain the original form but with a primitive 
recursive R. For example, 

$(x)(Ey)R(a, x, y) \equiv (x)(Ey_1)(Ey_2)[T_3(e, a, x, y_1, y_2) \;\&\; U(y_2) = 0]$
$\qquad \equiv (x)(Ey)[T_3(e, a, x, (y)_1, (y)_2) \;\&\; U((y)_2) = 0]$.


COROLLARY. The class of predicates expressible in a given one of the forms of
Theorem II after the first (for a given n variables) is the same whether a primitive
recursive or a general recursive R be allowed.


This generalizes the observation of Rosser that a class enumerable by a
general recursive function is also enumerable by a primitive recursive
function(18).

The formulas for the one-quantifier cases are

(29)  $(Ey)R(x_1, \dots, x_n, y) \equiv (Ey)[T_{n+1}(e, x_1, \dots, x_n, (y)_1, (y)_2) \;\&\; U((y)_2) = 0]$,

(30)  $(y)R(x_1, \dots, x_n, y) \equiv (y)[T_{n+1}(e, x_1, \dots, x_n, (y)_1, (y)_2) \rightarrow U((y)_2) = 0]$,


where $e$ is any Gödel number of the representing function of $R$. These afford
a new proof of the enumeration theorem of §4, with new enumerating
predicates, and thence a new proof of Theorem II.

10. Partial recursive predicates. Let $P(x_1, \dots, x_n)$ be a predicate which
may not be defined for all $n$-tuples of natural numbers as arguments. By a
completion of $P$ we understand a predicate $Q$ such that, if $P(x_1, \dots, x_n)$
is defined, then $Q(x_1, \dots, x_n)$ is defined and has the same value, and if
$P(x_1, \dots, x_n)$ is undefined, then $Q(x_1, \dots, x_n)$ is defined. In particular, the
completion $P^{+}(x_1, \dots, x_n)$ which is false when $P(x_1, \dots, x_n)$ is undefined,
and the completion $P^{-}(x_1, \dots, x_n)$ which is true when $P(x_1, \dots, x_n)$ is
undefined, we call the positive completion and negative completion of
$P(x_1, \dots, x_n)$, respectively. (In $P$ and $P^{+}$, the “positive parts” coincide; in
$P$ and $P^{-}$, the “negative parts” coincide.)
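
The two completions can be pictured with a small Python sketch. It rests on a strong simplifying assumption: an undefined value is modelled by the explicit marker None, whereas for a genuinely partial recursive predicate undefinedness means that the computation does not terminate and cannot in general be detected. The function names are ours.

```python
from typing import Callable, Optional

# A partial predicate returns True, False, or None (standing for "undefined").
PartialPredicate = Callable[[int], Optional[bool]]


def positive_completion(P: PartialPredicate) -> Callable[[int], bool]:
    """P+: agrees with P where P is defined, false where P is undefined."""
    return lambda x: P(x) is True


def negative_completion(P: PartialPredicate) -> Callable[[int], bool]:
    """P-: agrees with P where P is defined, true where P is undefined."""
    return lambda x: P(x) is not False


def example_P(x: int) -> Optional[bool]:
    """Hypothetical partial predicate: defined only for even x,
    where it says whether x is divisible by 4."""
    return (x % 4 == 0) if x % 2 == 0 else None
```

On the arguments 0, 1, 2, 3 the positive completion of example_P takes the values true, false, false, false and the negative completion the values true, true, false, true, so the two differ exactly where example_P is undefined.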

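If $P(x_1, \dots, x_n)$ is a partial recursive predicate, then by Theorem IV,


(31)  $P^{+}(x_1, \dots, x_n) \equiv (Ey)[T_n(e, x_1, \dots, x_n, y) \;\&\; U(y) = 0]$,
(32)  $P^{-}(x_1, \dots, x_n) \equiv (y)[T_n(e, x_1, \dots, x_n, y) \rightarrow U(y) = 0]$,


where $e$ is any Gödel number of the representing function of $P$.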
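Conversely, if $R(x_1, \dots, x_n, y)$ is any general recursive predicate, then
by Theorem III,


(33)  $(Ey)R(x_1, \dots, x_n, y) \equiv \{\mu y R(x_1, \dots, x_n, y) = \mu y R(x_1, \dots, x_n, y)\}^{+}$,
(34)  $(y)R(x_1, \dots, x_n, y) \equiv \{\mu y \overline{R}(x_1, \dots, x_n, y) \neq \mu y \overline{R}(x_1, \dots, x_n, y)\}^{-}$.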


THEOREM VI. The positive completion $P^{+}(x_1, \dots, x_n)$ of a partial recursive
predicate $P(x_1, \dots, x_n)$ is expressible in the form $(Ey)R(x_1, \dots, x_n, y)$ where
$R$ is primitive recursive; and conversely, any predicate expressible in the form
$(Ey)R(x_1, \dots, x_n, y)$ where $R$ is general recursive is the positive completion
$P^{+}(x_1, \dots, x_n)$ of a partial recursive predicate $P(x_1, \dots, x_n)$.


(18) Rosser [1, Lemma I, Corollary I, p. 88].


Dually, for negative completions $P^{-}(x_1, \dots, x_n)$ and the predicate form
$(y)R(x_1, \dots, x_n, y)$.

It follows that, for the predicate forms of Theorem II which have an exist- 
ential quantifier (universal quantifier) innermost, we may, without altering 
the class of predicates expressible in that form, take R to be the positive com- 
pletion (negative completion) of a partial recursive predicate. 

Let us abbreviate $U(\mu y T_n(z, x_1, \dots, x_n, y))$ as $\Phi_n(z, x_1, \dots, x_n)$(19).
Then $\Phi_n$ is a fixed partial recursive function of $n+1$ variables, from which any
partial recursive function $\phi$ of $n$ variables can be obtained thus (rewriting
(25)),


(35)  $\phi(x_1, \dots, x_n) \simeq \Phi_n(e, x_1, \dots, x_n)$


where $e$ is any Gödel number of $\phi$. Since for a constant $z$, $\Phi_n(z, x_1, \dots, x_n)$
is always a partial recursive function of the remaining $n$ variables,
$\Phi_n(z, x_1, \dots, x_n)$ therefore gives for $z = 0, 1, 2, \dots$ an enumeration (with
repetitions) of the partial recursive functions of $n$ variables. It follows that
$\Phi_n(z, x_1, \dots, x_n) = 0$ is a partial recursive predicate of $n+1$ variables which
enumerates (with repetitions) the partial recursive predicates of $n$ variables.

This, seen in the light of Theorem VI, has as consequence the enumeration 
theorem of §2 (with other enumerating predicates), and thence by Cantor’s 
diagonal method Theorem II. 

Elsewhere, the enumeration theorem for partial recursive functions gave
by Cantor’s diagonal method what may be called the fundamental theorem
for proofs of recursive definability(20).

This fundamental theorem, and the existence of partial recursive functions
and predicates, no completions of which are general recursive(21), are
what occasioned the introduction of the notion of a partial recursive function.


III. INCOMPLETENESS THEOREMS IN THE FOUNDATIONS 
OF NUMBER THEORY 


11. Introductory remarks. We entertain various propositions about natu- 
ral numbers. These propositions have meaning, independently of or prior to 
the consideration of formal postulates and rules of proof. We pose the problem 
of systematizing our knowledge about these propositions into a theory of 
some kind. For certain definitions of our objectives in constructing the theory, 
and certain classes of propositions, we shall be able to reach definite answers 
concerning the possibility of constructing the theory. 

The naive informal approach which we are adopting may be contrasted 


(19) Using the notation of Kleene [4, bottom p. 152], but with the changes in the
formulation of Theorem IV.

(20) Kleene [4, the last result in §2].

(21) Kleene [4, Footnote 3].


with that form of the postulational approach which consists in first listing 
formal postulates, which are then said to define the content of the theory 
based on them. In the case of number theory, the formal approach cannot 
render entirely dispensable an intuitive understanding of propositions of the 
kind which we commonly interpret the theory to be about. For the explicit 
statement of the postulates and characterization of the manner in which they 
are to determine the theory belong to a metatheory on another level of dis- 
course; and the ultimate metatheory must be an intuitive mathematics un- 
regulated by explicit postulates, and having the essential character of num- 
ber theory. 

Of course the informality of our investigation does not preclude the enu- 
meration, from another level, of postulates which would suffice to describe it. 
Indeed, such regulation may perhaps be considered necessary from an intui- 
tive standpoint for that part of it which belongs to the context of classical 
mathematics. 

The propositions about natural numbers which we shall consider will
contain parameters. We shall thus have infinitely many propositions of a given
form, according to the natural numbers taken as values by the parameters. 
In other words, we have predicates, for which these parameters are the inde- 
pendent variables. Generally, in a theory, a number of predicates are dealt 
with simultaneously; but for our investigations it will suffice to consider a 
theory with respect to some one predicate without reference to other predi- 
cates which might be present. Usually, we shall write a one-variable predicate 
P(a), though the discussion applies equally well to a predicate $P(a_1, \dots, a_n)$
of $n$ variables.

12. Algorithmic theories. As one choice of the objective, we can ask that 
the theory should give us an effective means for deciding, for any given one 
of the propositions which are taken as values of the predicate, whether that 
proposition is true or false. Examples of predicates for which a theoretical 
conquest of this kind has been obtained are: a is divisible by b (that is,
in symbols, (Ex)[a=bx]), ax+by=c is solvable for x and y (that is,
(Ex)(Ey) [ax+by=c]). We shall call this kind of theory for a predicate 
a complete algorithmic theory for the predicate. 
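
Both examples admit a complete algorithmic theory because the existential quantifiers can be bounded in advance. A minimal Python sketch, assuming that all variables range over the natural numbers (the function names and the particular bounds are ours):

```python
def divisible(a, b):
    """Decide (Ex)[a = b*x] over the natural numbers.
    If b >= 1 any witness satisfies x <= a; if b == 0 only a == 0 works."""
    return any(a == b * x for x in range(a + 1))


def solvable(a, b, c):
    """Decide (Ex)(Ey)[a*x + b*y = c] over the natural numbers.
    Whenever a solution exists, one exists with x <= c and y <= c."""
    return any(
        a * x + b * y == c
        for x in range(c + 1)
        for y in range(c + 1)
    )
```

Each procedure terminates for every choice of arguments and returns a definite answer, which is what is demanded of a complete algorithmic theory for the predicate in question.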

Let us examine the notion of this kind of theory more closely. In setting 
up a complete algorithmic theory, what we do is to describe a procedure, 
performable for each set of values of the independent variables, which pro- 
cedure necessarily terminates and in such manner that from the outcome 
we can read a definite answer, “Yes” or “No,” to the question, “Is the predi- 
cate value true?” 

We can express this by saying that we set up a second predicate: the pro- 
cedure terminates in such a way as to give the affirmative answer. The second 
predicate has the same independent variables as the first, is equivalent to the 
first, and the determinability of the truth or falsity of its values is guaranteed. 


This last property of the second predicate we designate as the property of 
being effectively decidable. 

Of course the original predicate becomes effectively decidable, in a deriva- 
tive sense, as soon as we have its equivalence to the second; extensionally, 
the two are the same. But while our terminology is ordinarily extensional, 
at this point the essential matter can be emphasized by using the intensional 
language. The reader may if he wishes write in more explicit statements re- 
ferring to the (generally) differing objects or processes with which the two 
predicates are concerned. 

Now, the recognition that we are dealing with a well defined process which 
for each set of values of the independent variables surely terminates so as to 
afford a definite answer, “Yes” or “No,” to a certain question about the man- 
ner of termination, in other words, the recognition of effective decidability in 
a predicate, is a subjective affair. Likewise, the recognition of what may be 
called effective calculability in a function. We may assume, to begin with,
an intuitive ability to recognize various individual instances of these notions. 
In particular, we do recognize the general recursive functions as being effec- 
tively calculable, and hence recognize the general recursive predicates as be- 
ing effectively decidable. 

Conversely, as a heuristic principle, such functions (predicates) as have 
been recognized as being effectively calculable (effectively decidable), and 
for which the question has been investigated, have turned out always to be 
general recursive, or, in the intensional language, equivalent to general recur- 
sive functions (general recursive predicates). This heuristic fact, as well as 
certain reflections on the nature of symbolic algorithmic processes, led Church 
to state the following thesis(22). The same thesis is implicit in Turing’s
description of computing machines(23).


THESIS I. Every effectively calculable function (effectively decidable predicate)
is general recursive.


Since a precise mathematical definition of the term effectively calculable 
(effectively decidable) has been wanting, we can take this thesis, together 
with the principle already accepted to which it is converse, as a definition of 
it for the purpose of developing a mathematical theory about the term. To 
the extent that we have already an intuitive notion of effective calculability 
(effective decidability), the thesis has the character of an hypothesis—a point 
emphasized by Post and by Church(24). If we consider the thesis and its converse
as definition, then the hypothesis is an hypothesis about the application
of the mathematical theory developed from the definition. For the acceptance 
of the hypothesis, there are, as we have suggested, quite compelling grounds. 


(22) Church [1].
(23) Turing [1].
(24) Post [1, p. 105], and Church [2].


A full account of these is outside the scope of the present paper(25). We are
here concerned rather to present the consequences. 

In the intensional language, to give a complete algorithmic theory for a 
predicate P(a) now means to find an equivalent effectively decidable predicate
Q(a). It would suffice that Q(a) be given as a general recursive predicate;
and by Thesis I, if Q(a) is not so given, then at least there is a general recur- 
sive predicate R(a) equivalent to Q(a) and hence to P(a). Thus to give a 
complete algorithmic theory for P(a) means to find an equivalent general 
recursive predicate R(a), or more briefly, to express P(a) in the form R(a) 
where R is general recursive. This predicate form is the one listed first in 
Theorem II; and Theorem II gives to each of the other forms a predicate not 
expressible in that form. Thus, while under our interpretations there is a com- 
plete algorithmic theory for each predicate of the form R(a) where R is gen- 
eral recursive, to each of the other forms there is a predicate for which no 
such theory is possible. We state this in the following theorem, using the
particular examples for the one-quantifier forms which were exhibited in the
proof of Theorem II.


THEOREM VII. There exists no complete algorithmic theory for either of the
predicates $(Ex)T_1(a, a, x)$ and $(x)\overline{T}_1(a, a, x)$.


Of course, once the definition of effective decidability is granted, which 
affords an enumeration of the effectively decidable predicates, Cantor’s meth- 
ods immediately give other predicates. This theorem, as additional content, 
shows the elementary forms which suffice to express such predicates. 

Abstracting from the particular examples used here, the theorem is 
Church’s theorem on the existence of an unsolvable problem of elementary 
number theory, and the corresponding theorem of Turing in terms of his 
machine concept(26). The unsolvability is in the sense that the construction
called for by the problem formulation, which amounts to that of a recursive 
R with a certain property, is impossible. The theorem itself constitutes solu- 
tion in a negative sense. 

13. Formal deductive theories. A second possibility for giving theoretic 
cohesion to the totality of true propositions taken as values of a predicate 
P(a) is that offered by the postulational or deductive method. We should like 
all and only those of the predicate values which are true to be deducible from 
given axioms by given rules of inference. To make the axioms and principles 
of inference quite explicit, according to modern standards of rigor, we shall 
suppose them constituted into a formal system (symbolic logic), in which 
the propositions taken as values of the predicate are expressible. Those and 
only those of the formulas expressing the true instances of the predicate 


(25) For a resume, see Kleene [4, Footnote 2], where further references are given.
(26) Turing [1, §8].


should be provable. We call this kind of theory for a predicate P(a) a com- 
plete formal deductive theory for the predicate. 

This type of theory should of course not be confused with incompletely 
formalized axiomatic theories, such as the theory of natural numbers itself 
as based on Peano’s axioms. 

It is convenient in discussing a formal system to name collectively as the 
“postulates” the rules describing the formal axioms and the rules of inference. 

Let us now examine more closely the concept of provability in a stated 
formal system. If the formalization does accomplish its purpose of making 
matters explicit, we should be able effectively to recognize each step of a 
formal proof as an application of a postulate of the system. Furthermore, if 
the system is to constitute a theory for the predicate P(a), we should be able 
effectively to recognize, to each natural number a, a certain formula of the 
system which is taken as expressing the proposition P(a). Together, these 
conditions imply that we should be able, given any sequence of formulas 
which might be submitted as a proof of P(a) for a given a, to check it, thus 
determining effectively whether it is actually such or not. 

Let us introduce a designation for the metamathematical predicate with 
which we deal in making this check, for a given formal system and predicate 
P(a). 

R(a, X): X is a proof in the formal system of the formula expressing the
proposition P(a).


Then the concept of provability in the system of the formula expressing 
P(a), or briefly, the provability of P(a), is expressible as (EX)R(a, X). 
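
To make the effective checkability of proofs concrete, here is a toy formal system in Python; every detail of it is an assumption made for illustration. Formulas are natural numbers, the formula a is read as expressing P(a): "a is even", the only axiom is 0, and the only rule infers n + 2 from an earlier line n. The function is_proof plays the role of the decidable predicate R(a, X), and provable carries out the search expressed by (EX)R(a, X) over an effective enumeration of candidate proofs.

```python
from itertools import count, product


def is_proof(a, X):
    """Decidable check R(a, X): X is a finite sequence of formulas, each
    the axiom 0 or obtained from an earlier line by adding 2, ending in a."""
    if not X or X[-1] != a:
        return False
    for i, line in enumerate(X):
        is_axiom = line == 0
        follows = any(line == earlier + 2 for earlier in X[:i])
        if not (is_axiom or follows):
            return False
    return True


def candidate_proofs(max_bound=None):
    """Effectively enumerate (with repetitions) all finite sequences of
    natural numbers; max_bound, if given, cuts the enumeration short."""
    bounds = count(1) if max_bound is None else range(1, max_bound + 1)
    for bound in bounds:
        for length in range(1, bound + 1):
            yield from product(range(bound + 1), repeat=length)


def provable(a, max_bound=None):
    """Search for a proof of a; returns the first one found, or None if
    the (optionally truncated) enumeration is exhausted."""
    for X in candidate_proofs(max_bound):
        if is_proof(a, X):
            return X
    return None
```

Without max_bound the search for a proof of an odd number never ends: the search yields only the affirmative answers. That this particular P is decidable by other means is an accident of the toy and not a feature of the one-quantifier form.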

As we have just argued, the predicate R(a, X) should be an effectively 
decidable metamathematical predicate. Here the formal objects over which X 
ranges, if the notation of the system is explicit, should be given in some man- 
ner which affords an effective enumeration of them. Using the indices in this 
enumeration, or generally any effective Gödel numbering of the formal objects,
the metamathematical predicate R(a, X) will be carried into a number-theoretic
predicate R(a, x), taken as false for any x not correlated to a formal
object, which should then also be effectively decidable. By Thesis I, the effec- 
tive decidability of the latter implies its general recursiveness. We are thus 
led to state a second thesis. 


THESIS II. For any given formal system and given predicate P(a), the predicate
that P(a) is provable is expressible in the form (Ex)R(a, x) where R is general
recursive.


This thesis corresponds to the standpoint that the role of a formal deduc- 
tive system for a predicate P(a) is that of making explicit the notion of what 
constitutes a proof of P(a) for a given a. If a proposed “formal system” for 
P(a) does not do this, we should say that it is not a formal system in the 


strict sense, or at least not one for P(a). Taken this way, the thesis has a 
definitional character. 

Presupposing, on the other hand, a prior conception of what constitutes a 
formal system for a given predicate in the strict sense, the thesis has the char- 
acter of an hypothesis, to which we are led both heuristically and from Thesis 
I by general considerations. 

Conversely, if a predicate of the form (Ex)R(a, x) where R is general re- 
cursive is given, it is easily seen that we can always set up a formal system 
of the usual sort, with an explicit criterion of proof, in which all true instances 
of this predicate and only those are provable. 

Using the thesis, and this converse, we can now say that to give a com- 
plete formal deductive theory for a predicate P(a) means to find an equiva- 
lent predicate of the form (Ex)R(a, x) where R is general recursive, or more 
briefly, to express the predicate in this form. By Theorem II, there are predi- 
cates of the other one-quantifier form, and of the forms with more quantifiers, 
not expressible in this form. Hence while there are complete formal deductive 
theories to each predicate of either of the forms R(a) and (Ex)R(a, x) where 
R is general recursive, to each of the other forms there is a predicate for which 
no such theory is possible. Specifically, using the one-quantifier example given
in the proof of Theorem II:


THEOREM VIII. There is no complete formal deductive theory for the predicate
$(x)\overline{T}_1(a, a, x)$.


This is the famous theorem of Gödel on formally undecidable propositions,
in a generalized form. A proposition is formally undecidable in a given
formal system if neither the formula expressing the proposition nor the formula
expressing its negation is provable in the system. Gödel gave such a
proposition for a certain formal system (by a method evidently applying to
similar systems), subject to the assumptions of the consistency and
ω-consistency of the system. Later Rosser gave another proposition, for which the
latter assumption is dispensed with(27).

In the present form of the theorem, we have a preassigned predicate
$(x)\overline{T}_1(a, a, x)$ and a method which, to any formal system whatsoever for this
predicate, gives a number f for which the following is the situation.

Suppose that the system meets the condition that the formula expressing 
the proposition $(x)\overline{T}_1(f, f, x)$ is provable only if that proposition is true. Then
the proposition is true but the formula expressing it unprovable. This state- 
ment of results uses the interpretation of the formula, but if the system has 
certain ordinary deductive properties for the universal quantifier and recur- 
sive predicates, our condition on the system is guaranteed by the metamathe- 
matical one of consistency. 

If the system contains also a formula expressing the negation of
$(x)\overline{T}_1(f, f, x)$, and if the system meets the further condition that this formula
is provable only if true, then this formula cannot be provable, and we have
a formally undecidable proposition. The further condition, if the system has
ordinary deductive properties, is guaranteed by the metamathematical one
of ω-consistency.

(27) Rosser [1].

Moreover, we can incorporate Rosser’s elimination of the hypothesis of
ω-consistency into the present treatment. To do so, we replace the predicate
(Ex)R(a, x) for the application of Theorem II by
$(Ex)[R(a, x) \,\&\, (y)[y < x \rightarrow \overline{S}(a, y)]]$, where (Ex)S(a, x) is the predicate
expressing the provability of the negation of $(x)\overline{T}_1(a, a, x)$. This changes
the f for the system.

Thus we come out with the usual metamathematical results for a given 
formal system. 

For the case that a formal system is sought which should not only prove
the true instances of P(a) but also refute the false ones, if the classical law
of the excluded middle is applied to the propositions P(a), then the Gödel
theorem (Theorem VIII) comes under the Church theorem (Theorem VII).
For had we completeness with respect both to P(a) and to $\overline{P}(a)$, we could
obtain a general recursive R(a) equivalent to the given predicate by the
method used in proving the second part of Theorem V. Informally, this
amounts merely to the remark that we should have the algorithm for P(a)
which consists in searching through some list of the provable formulas until
we encounter either the formula expressing P(a) or the formula expressing
$\overline{P}(a)$.
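
The informal algorithm just described can be sketched in Python as follows. The enumeration `toy_theorems` is a hypothetical stand-in for "some list of the provable formulas": it is assumed, purely for illustration, to come from a theory which is complete for the predicate "a is even" and for its negation.

```python
from itertools import count


def decide(a, theorems):
    """Search an enumeration of provable formulas until the formula for P(a)
    or the formula for not-P(a) is encountered.

    `theorems` yields pairs (sign, n), where sign is "P" or "notP".
    The search terminates for every a only if the theory is complete in both senses.
    """
    for sign, n in theorems:
        if n == a:
            return sign == "P"


def toy_theorems():
    """Hypothetical enumeration of theorems for P(a): 'a is even'."""
    for n in count():
        yield ("P", n) if n % 2 == 0 else ("notP", n)
```

For an incomplete theory the same loop may run forever on some a, which is exactly why completeness in both directions would yield a decision procedure.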

The connection between Gödel’s theorem and the paradoxes has been
much noted. The author gave a proof of Gödel’s theorem along much the
present lines but as a refinement of the Richard paradox rather than of the
Epimenides(28). That gave the undecidable propositions as values of a predicate
of the more complicated form (x)(Ey)R(a, x, y) where R is general recursive.
The Epimenides paradox now appears as the more basic. Currently,
Curry has noted the same phenomenon in connection with the Kleene-Rosser
inconsistency theorem(29).

14. Discussion, incomplete theories. In the present form of Gödel’s theorem,
several aspects are brought into the foreground which perhaps were not
as clearly apparent in the original version. 

Not merely, to any given formal system of the type considered, can a 
proposition be formulated with respect to which that system is incomplete, 
but all these propositions can be taken as values of a preassignable elementary 
predicate, with respect to which predicate therefore no system can be com- 
plete. This depends on the thesis giving a preassignable form to the concept 
of provability in a formal system.


(28) Kleene [2, XIII].
(29) Kleene and Rosser [1], Curry [2].




For the interpretation of the propositions we have required, as minimum, 
only the notions of effectively calculable predicates and of the quantifiers used 
constructively. It seems that lesser presuppositions, if one is to allow any 
mathematical infinite, are hardly conceivable. 

Beyond that the system should fulfil the structural characteristic ex- 
pressed in Thesis II, and should yield results correct under this modicum 
of interpretation, we have need of no reference whatsoever to its detailed 
constitution.

In particular, the nature of the intuitive evidence for the deductive proc- 
esses which are formalized in the system plays no role. 

Let us imagine an omniscient number theorist, whom we should expect, 
through his ability to see infinitely many facts at once, to be able to frame 
much stronger systems than any we could devise. Any correct system which 
he could reveal to us, telling us how it works without telling us why, would 
be equally subject to the Gödel incompleteness.

It is impossible to confine the intuitive mathematics of elementary propo- 
sitions about integers to the extent that all the true theorems will follow from 
explicitly stated axioms by explicitly stated rules of inference, simply because 
the complexity of the predicates soon exceeds the limited form representing 
the concept of provability in a stated formal system. 

We selected as the objective in constructing a formal deductive system 
that what constitutes proof should be made explicit in the sense that a pro- 
posed proof could be effectively checked, and either declared formally correct 
or declared formally incorrect. 

Let us for the moment entertain a weaker conception of a formal system, 
under which, if we should happen to discover a correct proof of a proposition 
or be presented with one, then we could check it and recognize its formal cor- 
rectness, but if we should have before us an alleged proof which is not correct, 
then we might not be able definitely to locate the formal fallacy. In other 
words, under this conception a system possesses a process for checking, which 
terminates in the affirmative case, but need not in the negative. Then the 
concept of provability would have the form $(Ex)P^{+}(a, x)$ where $P^{+}$ is the
positive completion of a partial recursive predicate P(a, x). By Theorem VI,
$P^{+}(a, x)$ is expressible in the form (Ey)R(a, x, y) where R is general recursive.
Then the provability concept has the form (Ex)(Ey)R(a, x, y), or by contraction
of quantifiers $(Ex)R(a, (x)_1, (x)_2)$. This is of the form (Ex)R(a, x) where
R is general recursive. Thus the concept of provability has the usual form,
and Gödel’s theorem applies as before. If we take a new concept of proof
based on R(a, x), that is, if we redesignate the steps in the checking process as 
the formal proof steps, the concept of proof assumes the usual form. 
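
The contraction of quantifiers used here can be sketched in Python. The coding is an assumption made for the example: a single number w carries the pair (x, y) as the exponents of the primes 2 and 3 in w, these exponents playing the role of $(w)_1$ and $(w)_2$. The bounded search in `exists` is only a device to keep the sketch terminating.

```python
def exponent(w: int, p: int) -> int:
    """Exponent of the prime p in w (requires w >= 1)."""
    e = 0
    while w % p == 0:
        w //= p
        e += 1
    return e


def contract(R2):
    """Turn a relation R2(a, x, y) into R1(a, w) = R2(a, (w)_1, (w)_2)."""
    return lambda a, w: R2(a, exponent(w, 2), exponent(w, 3))


def exists(R1, a, limit):
    """Bounded search for a witness w with 1 <= w < limit; None if none is found."""
    return next((w for w in range(1, limit) if R1(a, w)), None)
```

With the hypothetical relation R2(a, x, y) meaning "x² + y² = a", the instance a = 13 has the two-number witnesses (2, 3) and (3, 2), and the least single witness is w = 2³·3² = 72.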

We gave no attention, when we formulated the objectives both of an algo- 
rithmic and of a formal deductive theory, to the nature of the evidence for 
the correctness of the theory, or to various other practical considerations,
simply because the crude structural objectives suffice to entail the corresponding
incompleteness theorems. In this connection, it may be of some interest
to give the corresponding definitions, although these may not take into ac- 
count all the desiderata, for the case of incomplete theories of the two sorts. 
We shall state these for predicates of n variables $a_1, \ldots, a_n$, as we could also
have done for the case of the complete theories.

To give an algorithmic theory (not necessarily complete) for a predicate
$P(a_1, \ldots, a_n)$ is to give a general recursive function $\pi(a_1, \ldots, a_n)$, taking
only 0, 1, and 2 as values, such that

(36) $\pi(a_1, \ldots, a_n) = 0 \rightarrow P(a_1, \ldots, a_n), \qquad \pi(a_1, \ldots, a_n) = 1 \rightarrow \overline{P}(a_1, \ldots, a_n).$

The algorithm always terminates, but if $\pi(a_1, \ldots, a_n)$ has the value 2 we
can draw no conclusion about $P(a_1, \ldots, a_n)$.
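
A three-valued algorithm of this kind can be sketched in Python. The predicate chosen is a hypothetical example, not one from the text: P(a) is "the Collatz iteration started at a reaches 1", and the step budget is an assumption which guarantees termination.

```python
def theory(a: int, budget: int = 1000) -> int:
    """Return 0 if P(a) is established, 1 if not-P(a) is established, 2 if undetermined.

    P(a): the iteration n -> n/2 (n even), n -> 3n + 1 (n odd), started at a, reaches 1.
    """
    seen = set()
    n = a
    for _ in range(budget):
        if n == 1:
            return 0  # witness found: P(a) holds
        if n in seen:
            return 1  # a cycle avoiding 1: P(a) fails
        seen.add(n)
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    return 2  # budget exhausted: no conclusion about P(a)
```

The function always terminates; the value 2 records only that the allotted search settled nothing, in line with the remark that no conclusion can then be drawn.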

To give a formal deductive theory (not necessarily complete) for a predicate
$P(a_1, \ldots, a_n)$ is to give a general recursive predicate $R(a_1, \ldots, a_n, x)$ such
that

(37) $(Ex)R(a_1, \ldots, a_n, x) \rightarrow P(a_1, \ldots, a_n).$

In words, to give a formal deductive theory for a predicate $P(a_1, \ldots, a_n)$ is
to find a sufficient condition for it of the form $(Ex)R(a_1, \ldots, a_n, x)$ where R
is general recursive. Here, according to circumstances, the sufficiency may be
established from a wider context, or it may be a matter of postulation (hypothesis),
or of conviction (belief).

From the present standpoint, the setting up of this sufficient condition 
is the essential accomplishment in the establishment of a so-called metatheory 
(in the constructive sense) for the body of propositions taken as the values of 
a predicate. We note that this may be accomplished without necessarily going 
through the process of setting up a formal object language, from which R is 
obtainable by subsequent arithmetization, although as remarked above, we 
can always set up the object language, if we have the R by some other means. 

In the view of the present writer, the interesting variations of formal technique
recently considered by Curry have the above as their common feature
with formalization of the more usual sort(30). This is stated in our terminology,
Curry’s use of the terms “meta” and “recursive” being different. He
gives examples of “formal systems,” in connection with which he introduces
some predicates by what he calls “recursive definitions,” but what we should
prefer to call “inductive definitions.” This important type of definition, under
suitable precise delimitation so that the individual clauses are constructive,
can be shown to lead always to predicates expressible in the form
$(Ex)R(a_1, \ldots, a_n, x)$ where R is recursive in our sense. Indeed, this fact
can be recognized by substantially the method indicated above for the case
of the inductive definition establishing the notion of provability for a formal
system of the usual sort.

(30) Curry [1].

Conversely, given any predicate expressible in the form $(Ex)R(a_1, \ldots, a_n, x)$
where R is recursive, we can set up an inductive definition for it.

15. Ordinal logics. In ordinal logics, studied by Turing(31), the requirement
of effectiveness for the steps of deduction is relaxed to allow dependence
on a number (or λ-formula) which represents an ordinal in the Church-Kleene
theory of constructive ordinals(32). A presumptive proof in an ordinal
logic cannot in general be checked objectively, since the proof character depends
on the number which occupies the role of a Church-Kleene representative
of an ordinal actually being such, for which there is no effective criterion.
Nevertheless it was hoped that ordinal logics could be used to give complete 
orderings (with repetitions) of the true propositions of certain forms into 
transfinite series, by means of the ordinals represented in the proofs, in such 
a way that the proving of a proposition in the ordinal logic (and therewith 
the determination of a position for it in the series) would somehow make it 
easier to recognize the truth of the proposition. 

Turing obtained a number of interesting results, largely outside the scope 
of this article, but among them the following. There are ordinal logics which 
are complete for the theory of a predicate of the form (x)(Ey)R(a, x, y) where 
R is general recursive; however, for the example of such a logic which is given,
its use would afford no theoretic gain, since the recognition that the number 
which plays the role of ordinal representative in a proof of the logic is actually 
such comes to the same as the direct recognition of the truth of the proposi- 
tion proved. 

Now let us approach the topic by inquiring whether, and if so where, the
property of being provable in a given ordinal logic is located in the scale of
predicate forms of Theorem II. First, it turns out that the property of a
number a of being the representative of an ordinal is expressible in the form
(x)(Ey)R(a, x, y) where R is recursive(33). Now we may use the definition of
ordinal logic in terms of λ-conversion, or we may take the notion in general
terms as described above, and state the thesis that for a given predicate P(a)
and given ordinal logic the provability of P(a) is expressible in the form
$(E\alpha)(Ex)R(\alpha, a, x)$ where α ranges over the ordinal representatives and R
is general recursive. In either case, it then follows that the provability of P(a)
is expressible in the form (Ex)(y)(Ez)R(a, x, y, z) where R is general recursive.
Conversely, to any predicate of the latter form, we can find an ordinal logic
in the more general sense such that provability in the logic expresses the predicate.
Hence there is a complete ordinal logic to each predicate of each of the
forms

$$
\begin{array}{llll}
     & (Ex)R(a, x) & (x)(Ey)R(a, x, y) & (Ex)(y)(Ez)R(a, x, y, z) \\
R(a) &             &                   &                          \\
     & (x)R(a, x)  & (Ex)(y)R(a, x, y) &
\end{array}
$$

where R is general recursive, but by Theorem II, classically there are predicates
of the form (x)(Ey)(z)R(a, x, y, z) and of each of the forms with more
quantifiers, or classically and intuitionistically of the form
$\overline{(Ex)(y)(Ez)R(a, x, y, z)}$ and of the negation of each of the forms with more
quantifiers, for which no complete ordinal logic is possible. Specifically:

THEOREM IX. There is no complete ordinal logic for the predicate
$\overline{(Ex)(y)(Ez)T_3(a, a, x, y, z)}$.

(31) Turing [2]. Turing gave a somewhat restricted definition of “ordinal logic” in terms of
the theory of λ-conversion for predicates expressible in the form (x)(Ey)R(a, x, y) where R is
recursive.

(32) Church and Kleene [1], Church [2], Kleene [4].

(33) Kleene [5].


Ordinal logics form a class of examples of the systems of propositions 
which have recently come under discussion, in which more or less is retained 
of the ordering of propositions in deductive reasoning, but with an extension 
into the transfinite, or a sacrifice of constructiveness in individual steps. These 
may be called “non-constructive logics,” in contrast to the formal deductive 
systems in the sense of §§13-14 which are “constructive logics.” In general, 
the usefulness of a non-constructive logic may be considered to depend on the 
degree to which the statement of the non-constructive proof criterion is re- 
moved from the direct statement of the propositions. 

Theorem IX is a “Gödel theorem” for the ordinal logics. The ordinal logics
were at least conceived with somewhat of a constructive bias. Rosser has
shown how Gödel theorems arise on going very far in the direction of
non-constructiveness(34), and Tarski has stated the Gödel argument for systems
of sentences in general(35). Incidentally, Rosser’s results for finite numbers of
applications of the Hilbert “rule of infinite induction,” also called “Carnap’s
rule,” can easily be inferred from Theorem II, through the obvious correspondence
of an application of this rule to a universal quantifier in the proof
concept. However, the proof concepts for non-constructive logics soon outrun
the scale of predicate forms of Theorem II. This appears to be the case even
for the extension to protosyntactical definability given by Quine(36). If one
is going very far in the direction of non-constructiveness, and is not interested
in considerations of the sort emphasized in §§12-14, there is no advantage in
starting from the theory of recursive functions. But the more general results
do not detract from the special significance which attaches to the Gödel theorems
associated with provability criteria of the forms R(a) and (Ex)R(a, x)
where R is general recursive, that is, Church’s theorem and Gödel’s theorem,
for which forms only it is true that a given proof is a finite object.

(34) Rosser [2].
(35) Tarski [2].
(36) Quine [1].

16. Constructive existence proofs. A proof of an existential proposition 
(Ey)A(y) is acceptable to an intuitionist, only if in the course of the proof 
there is given a y such that A (y) holds, or at least a method by which such a y 
could be constructed. Consider the case that A(y) depends on other variables. 
Say that there is one of these, x, and rewrite the proposition as (x)(Ey)A (x, y). 
The proposition asserts the existence of a y to each of the infinitely many 
values of x. In this case, the only way in which the constructivist demand 
could in general be met would be by giving the y as an effectively calculable 
function of x, that is, by giving the function. According to Thesis I, this func- 
tion would have to be general recursive. Hence we propose the following thesis 
(and likewise for variables $x_1, \ldots, x_n$):


THESIS III. A proposition of the form (x)(Ey)A(x, y) containing no free
variables is provable constructively, only if there is a general recursive function
φ(x) such that (x)A(x, φ(x)).


When such a φ exists, we shall say that (x)(Ey)A(x, y) is recursively fulfillable(37).
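
A small Python sketch may make the notion concrete. The relation `A` is a hypothetical example chosen for illustration ("y is a prime greater than x"), and the fulfilling function is obtained by unbounded least-number search, which terminates here because there is always a larger prime.

```python
from itertools import count


def mu(pred):
    """Least-number operator: the smallest y with pred(y); does not terminate if none exists."""
    return next(y for y in count() if pred(y))


def is_prime(n: int) -> bool:
    """Trial division; adequate for the small numbers used in this sketch."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def A(x: int, y: int) -> bool:
    """Hypothetical relation: y is a prime greater than x."""
    return y > x and is_prime(y)


def phi(x: int) -> int:
    """A recursive function fulfilling (x)(Ey)A(x, y): phi(x) is the least y with A(x, y)."""
    return mu(lambda y: A(x, y))
```

The point of the thesis is that a constructive proof of (x)(Ey)A(x, y) must supply some such computable φ; the search above is merely the simplest way of producing one when A is decidable and the proposition is true.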

This thesis expresses what seems to be demanded from the standpoint of 
the intuitionists. Whether such explicit rules of proof as they have stated do 
conform to the thesis is a further question which will be considered elsewhere(38).
However, in its aspect as restriction on all intuitionistic existence
proofs, the possibilities for which, as we know by Theorem VIII, transcend 
the limitations of any preassignable formal system, the thesis is more general 
than a metamathematical result concerning a given system. 

We now examine the notion of recursive fulfillability as it applies to the
values of a given predicate of the form (x)(Ey)(z)R(a, x, y, z) where R is general
recursive. Select any fixed value of a. Given a recursive φ which fulfils the
corresponding proposition, by Theorem IV there is a number e such that
$(x)(Ey)T_1(e, x, y)$ and $(x)(y)[T_1(e, x, y) \rightarrow (z)R(a, x, U(y), z)]$. Conversely, if
such an e exists, the proposition is fulfilled by the general recursive function
$U(\mu y\, T_1(e, x, y))$. Thus

$$(Ee)\{(x)(Ey)T_1(e, x, y) \,\&\, (x)(y)[T_1(e, x, y) \rightarrow (z)R(a, x, U(y), z)]\}$$

is a necessary and sufficient condition for recursive fulfillability. When the
quantifiers are suitably brought to the front and contracted, this assumes the
form (Ex)(y)(Ez)R(a, x, y, z) with another general recursive R depending on
the original R.
By Theorem II, classically, there is a predicate of the original form
(x)(Ey)(z)R(a, x, y, z) which is not expressible in this form (Ex)(y)(Ez)R(a, x, y, z),
in which the condition of its recursive fulfillability is expressible.

(37) A further analysis of the implications of constructive provability is given in Kleene [6].
(38) Nelson [1].

Using the example of such a predicate given in the proof of Theorem II, 
we have then 


(38) $\{(x)(Ey)(z)\overline{T}_3(a, a, x, y, z) \text{ rec. fulf.}\} \equiv (Ex)(y)(Ez)R(a, x, y, z)$

for a certain general recursive R. Substituting the number f of (14) for a in
(14) and (38),

(39) $(Ex)(y)(Ez)R(f, x, y, z) \equiv (Ex)(y)(Ez)T_3(f, f, x, y, z),$
(40) $\{(x)(Ey)(z)\overline{T}_3(f, f, x, y, z) \text{ rec. fulf.}\} \equiv (Ex)(y)(Ez)R(f, x, y, z).$

By the definition of recursive fulfillability,

(41) $\{(x)(Ey)(z)\overline{T}_3(f, f, x, y, z) \text{ rec. fulf.}\} \rightarrow (x)(Ey)(z)\overline{T}_3(f, f, x, y, z).$


Suppose that $(x)(Ey)(z)\overline{T}_3(f, f, x, y, z)$ were recursively fulfillable. We
could then conclude by (40) and (39), $(Ex)(y)(Ez)T_3(f, f, x, y, z)$, and by (41),
$(x)(Ey)(z)\overline{T}_3(f, f, x, y, z)$. These results are incompatible. Therefore by reductio
ad absurdum, $(x)(Ey)(z)\overline{T}_3(f, f, x, y, z)$ is not recursively fulfillable, and
hence by Thesis III not constructively provable.

Now by (40) and (39), we have $\overline{(Ex)(y)(Ez)T_3(f, f, x, y, z)}$; and thence
classically we can proceed to $(x)(Ey)(z)\overline{T}_3(f, f, x, y, z)$.


THEOREM X. For a certain number f, the proposition $(x)(Ey)(z)\overline{T}_3(f, f, x, y, z)$
is true classically, but not constructively provable.


Notice that we have here a fixed unprovable proposition for all constructive
methods of reasoning, whereas in the preceding incompleteness theorems
we had only an infinite class of propositions, some of which must be unprov- 
able in a given theory. 

Intuitionistic number theory has been presented as a subsystem of the 
classical, so that the intuitionistic results hold classically, though many classi- 
cal results are not asserted intuitionistically. The possibility now appears of 
extending intuitionistic number theory by incorporating Thesis III in the 
form 


$$(x)(Ey)A(x, y) \rightarrow \{\text{for some general recursive } \varphi,\ (x)A(x, \varphi(x))\}$$


so that the two number theories should diverge, with the proposition of
Theorem X true classically, and its negation true intuitionistically(39).
For the classical proof, an application of

$$\overline{(x)A(x)} \rightarrow (Ex)\overline{A}(x)$$

suffices as the sole non-intuitionistic step; therewith that law of logic would
be refuted intuitionistically, for a certain A. Hitherto the intuitionistic refutations
of laws of the classical predicate calculus have depended on the
interpretation of the quantifiers in intuitionistic set theory(40).

(39) This is perhaps hinted in Church [1, first half of p. 363].

The result of Theorem X, with another proposition as example, can be 
reached as follows. Consider the proposition, 


$$(x)(Ey)\{[(Ez)T_1(x, x, z) \,\&\, y = 0] \vee [(z)\overline{T}_1(x, x, z) \,\&\, y = 1]\}.$$

This holds classically, by application of the law of the excluded middle in the
form

$$(x)\{(Ez)A(x, z) \vee (z)\overline{A}(x, z)\},$$

or the form

$$(x)(A(x) \vee \overline{A}(x)),$$

from which the other follows by substituting (Ez)A(x, z) for A(x). But it is
not recursively fulfillable, since it can be fulfilled only by the representing
function of the predicate $(Ez)T_1(x, x, z)$, which, as we saw in the proof of
Theorem II, is non-recursive.

17. Non-elementary predicates. The elementary predicates are enumer- 
able. By Cantor’s methods, there are therefore non-elementary number-theo- 
retic predicates. However let us ask what form of definition would suffice to 
give such a predicate. Under classical interpretations, the enumeration of 
predicate forms given in Theorem II for m variables suffices for the expression 
of every elementary predicate of m variables. By defining relations of the form 
shown in the next theorem, we can introduce a predicate M(a, k) so that it 
depends for different values of k on different numbers of alternating quanti- 
fiers. On the basis of Theorem II, it is possible to do this in such a way that 
the predicate will be expressible in none of the forms of Theorem II. 


THEOREM XI. Classically, there is a non-elementary predicate M(a, k) definable
by relations of the form

$$
\begin{aligned}
M(a, 0) &\equiv R(a), \\
M(a, 2k + 1) &\equiv (Ex)M(\varphi(a, x), 2k), \\
M(a, 2k + 2) &\equiv (x)M(\varphi(a, x), 2k + 1),
\end{aligned}
$$

where R and φ are primitive recursive.
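
To see how the number of alternating quantifiers grows with k, one may unfold the relations of the theorem for the first few values of k. The bound variables y and z below are renamings introduced only to keep the nested quantifiers apart:

```latex
\begin{aligned}
M(a, 1) &\equiv (Ex)\,R(\varphi(a, x)), \\
M(a, 2) &\equiv (x)\,M(\varphi(a, x), 1)
         \equiv (x)(Ey)\,R(\varphi(\varphi(a, x), y)), \\
M(a, 3) &\equiv (Ex)\,M(\varphi(a, x), 2)
         \equiv (Ex)(y)(Ez)\,R(\varphi(\varphi(\varphi(a, x), y), z)).
\end{aligned}
```

Each increase of k by one prefixes one more quantifier of the opposite kind, so that for fixed k the predicate M(a, k) is of a form with k alternating quantifiers.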


We are dealing here with essentially the same fact which Hilbert-Bernays
discover by setting up a truth definition for their formal system (Z)(41).

The system (Z) has as primitive terms only ′, +, ·, = and the logical
operations. The predicates expressible in these terms are elementary. Conversely,
using Theorem IV and Gödel’s reduction of primitive recursive functions
to these terms(42), every elementary predicate is expressible in (Z).
The Hilbert-Bernays result is an application to (Z) of Tarski’s theorem
on the truth concept(43), with the determination of a particular form of relations
which give the truth definition for (Z). If (Z) is consistent, a formal
proof that the relations do define a predicate is beyond the resources of (Z).

(40) Heyting [1, p. 65].
(41) Hilbert and Bernays [1, pp. 328-340].


(42) Gödel [1, Theorem VII]. See Kleene [3 (erratum: p. 544, line 11, “of” should be at the
end of the line)].
(43) Tarski [1].


BIBLIOGRAPHY


ALONZO CHURCH
1. An unsolvable problem of elementary number theory, Amer. J. Math. vol. 58 (1936) pp.
345-363.
2. The constructive second number class, Bull. Amer. Math. Soc. vol. 44 (1938) pp. 224-232.
ALONZO CHURCH AND S. C. KLEENE
1. Formal definitions in the theory of ordinal numbers, Fund. Math. vol. 28 (1936) pp. 11-21.
ALONZO CHURCH AND BARKLEY ROSSER
1. Some properties of conversion, Trans. Amer. Math. Soc. vol. 39 (1936) pp. 472-482.
H. B. CURRY
1. Some aspects of the problem of mathematical rigor, Bull. Amer. Math. Soc. vol. 47 (1941)
pp. 221-241.
2. The inconsistency of certain formal logics, J. Symbolic Logic vol. 7 (1942) pp. 115-117.
KURT GÖDEL
1. Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I,
Monatshefte für Mathematik und Physik vol. 38 (1931) pp. 173-198.
2. On undecidable propositions of formal mathematical systems, notes of lectures at the
Institute for Advanced Study, 1934.
DAVID HILBERT AND PAUL BERNAYS
1. Grundlagen der Mathematik, vol. 2, Berlin, Springer, 1939.
AREND HEYTING
1. Die formalen Regeln der intuitionistischen Mathematik, Preuss. Akad. Wiss. Sitzungsber.,
Phys.-math. Kl. 1930, pp. 57-71, 158-169.
S. C. KLEENE
1. A theory of positive integers in formal logic, Amer. J. Math. vol. 57 (1935) pp. 153-173,
219-244.
2. General recursive functions of natural numbers, Math. Ann. vol. 112 (1936) pp. 727-742.
3. A note on recursive functions, Bull. Amer. Math. Soc. vol. 42 (1936) pp. 544-546.
4. On notation for ordinal numbers, J. Symbolic Logic vol. 3 (1938) pp. 150-155.
5. On the forms of the predicates in the theory of constructive ordinals, to appear in Amer. J.
Math. (Bull. Amer. Math. Soc. abstract 48-5-215).
6. On the interpretation of intuitionistic number theory, Bull. Amer. Math. Soc. abstract
48-1-85.
S. C. KLEENE AND BARKLEY ROSSER
1. The inconsistency of certain formal logics, Ann. of Math. (2) vol. 36 (1935) pp. 630-636.
DAVID NELSON
1. Recursive functions and intuitionistic number theory, under preparation.
E. L. POST
1. Finite combinatory processes—formulation 1, J. Symbolic Logic vol. 1 (1936) pp. 103-105.
W. V. QUINE
1. Mathematical logic, New York, Norton, 1940.
BARKLEY ROSSER
1. Extensions of some theorems of Gödel and Church, J. Symbolic Logic vol. 1 (1936) pp. 87-91.
2. Gödel theorems for non-constructive logics, ibid. vol. 2 (1937) pp. 129-137.
ALFRED TARSKI
1. Der Wahrheitsbegriff in den formalisierten Sprachen, Studia Philosophica vol. 1 (1936)
pp. 261-405. (Original in Polish, 1933.)
2. On undecidable statements in enlarged systems of logic and the concept of truth, J. Symbolic
Logic vol. 4 (1939) pp. 105-112.
A. M. TURING
1. On computable numbers, with an application to the Entscheidungsproblem, Proc. London
Math. Soc. (2) vol. 42 (1937) pp. 230-265.
2. Systems of logic based on ordinals, ibid. vol. 45 (1939) pp. 161-228.


AMHERST COLLEGE,
AMHERST, MASS.



BÉZOUT’S THEOREM AND ALGEBRAIC
DIFFERENTIAL EQUATIONS(1)


BY 
J. F. RITT 


The problem of determining by inspection the number of solutions of a
system of algebraic equations finds its solution in Bézout’s theorem and in
important complements to that theorem obtained in recent years by van der
Waerden(2). The corresponding problem for a system of algebraic differential
equations is that of determining bounds for the numbers of arbitrary constants
which enter into the irreducible manifolds which the system yields.
This problem has been considered by us in two previous papers(3).

In the present paper, we study the intersections of the general solutions 
of two algebraically irreducible forms A and B in the unknowns y and z. The 
statement of our results depends on some definitions which we proceed to give. 

Let F be a form in several unknowns. F has an order in each of its un- 
knowns. The maximum of these orders will be called the order of F. 

Let Σ be a non-trivial prime ideal of forms in any unknowns. Σ has a certain
number g ≥ 0 of arbitrary unknowns. We shall call g the dimension of
the manifold of Σ.

By the order of an irreducible manifold $\mathfrak{M}$ of dimension zero, we mean the
order of any resolvent for the prime ideal of which $\mathfrak{M}$ is the manifold.

An irreducible manifold $\mathfrak{N}$ which is part of a manifold $\mathfrak{M}$ will be called an
irreducible component (often simply component) of $\mathfrak{M}$ if $\mathfrak{M}$ contains no irreducible
manifold of which $\mathfrak{N}$ is a proper part(4).

Let us return now to A and B as above, which we suppose to have the
respective orders m and n. Let the general solutions of A and B have a non-vacuous
intersection $\mathfrak{M}$. It is a most natural conjecture that, if $\mathfrak{M}$ has one or
more irreducible components of dimension zero, their orders do not exceed
m+n. This conjecture is verified below for the cases in which neither of m
and n exceeds unity. It was not without surprise that we found our conjecture
to lapse into default for larger values of the orders. We shall show how to
construct, for every n ≥ 4, a form of order n whose general solution intersects
the manifold of y=0 in the manifold of y=0, $z_{2n-3}=0$, a manifold of order
2n-3.


Presented to the Society, October 31, 1942; received by the editors March 23, 1942.

(1) For indications in regard to the general theory to which this paper attaches, one may
consult the author’s paper in the second volume of the Semicentennial Publications of the American
Mathematical Society.

(2) van der Waerden, Einführung in die algebraische Geometrie, Berlin, 1939, chap. 6.

(3) Systems of algebraic differential equations, Ann. of Math. (2) vol. 36 (1935) p. 293;
Jacobi's problem on the order of a system of differential equations, ibid. p. 303. The second of these
papers will be denoted below by J.

(4) In other words, $\mathfrak{N}$ is essential in $\mathfrak{M}$.


FORMS OF ORDERS NOT EXCEEDING UNITY 


1. We prove the statement made, for the cases with m ≤ 1, n ≤ 1, in the
introduction.

When m=n=0, there is nothing to prove.

Let m=0, n=1. Let $\mathfrak{N}$ be a component of $\mathfrak{M}$ of dimension zero. We consider
first the intersection $\mathfrak{M}'$ of the complete manifolds of A and B. Every
component of $\mathfrak{M}'$ of dimension zero has an order not exceeding unity(5). Then,
by Gourin’s theorem(6), if $\mathfrak{N}$ is not contained in a component of $\mathfrak{M}'$ of dimension
unity, $\mathfrak{N}$ is of order not greater than unity.

We have now to consider the case in which $\mathfrak{N}$ is contained in a component
$\mathfrak{M}''$ of $\mathfrak{M}'$ of dimension unity. $\mathfrak{M}''$ is the general solution of a form C. Because
A, which is of order zero, holds $\mathfrak{M}''$, C must be of order zero; this implies
that $\mathfrak{M}''$ is the manifold of A. Then $\mathfrak{M}''$ must be a component of the
manifold of B. Otherwise $\mathfrak{M}''$ would be contained in the general solution of B
and $\mathfrak{N}$ would not be a component of $\mathfrak{M}$(7).

We suppose, as we may, that A involves z effectively. As $\mathfrak{M}''$ is a proper
part of the manifold of B, B must be of order unity in z. Let $S=\partial A/\partial z$. Then
some has a representation 


Here $A_1$ is the derivative of A and, for every i, $p_i+q_i>p$. The orders of the $C_i$
in z and in y do not exceed 0 and 1, respectively, and no $C_i$ is divisible by A.
As $\mathfrak{N}$ is in the intersection of $\mathfrak{M}''$ and the general solution of B, $C_0$ must
hold $\mathfrak{N}$(8). The manifold of the system $C_0$, A has components which are all
of dimension zero and none of order greater than unity(9). This disposes of
the case of m=0, n=1.

Now let m=n=1. We use $\mathfrak{M}$ and $\mathfrak{M}'$ as above. We take up immediately
the case in which $\mathfrak{N}$ is contained in a component $\mathfrak{M}''$ of $\mathfrak{M}'$ of dimension
unity; when $\mathfrak{N}$ is not so contained, its order cannot exceed 2(*). As $\mathfrak{N}$ is a
component of $\mathfrak{M}$, $\mathfrak{M}''$ is not part of $\mathfrak{M}$. Let, then, $\mathfrak{M}''$ fail to be contained
in the general solution of B. Then some other component of the manifold
of B, indeed the manifold of a form of order zero, contains $\mathfrak{M}''$ and is thus
identical with $\mathfrak{M}''$. By the case of m=0, n=1, the components of the intersection
of $\mathfrak{M}''$ with the general solution of B are of dimension zero and of
order at most unity. This completes the proof.

(5) This is proved in J.

(6) Bull. Amer. Math. Soc. vol. 39 (1933) p. 593.

(7) The components of B other than its general solution are manifolds of forms of order
zero. See On certain points in the theory of algebraic differential equations, Amer. J. Math. vol. 60
(1938) pp. 1-43, §30. This paper will be denoted by C. P.

(8) C. P. §31.

(9) By J.


A FORM OF ORDER FOUR


2. In what follows, $K_1$ will represent, for any form K, the derivative of K.
We let

(1) $A = y_1 - z_3 y^2$,

(2) $B = A^4 - y_3^8$,

(3) $C = y_3 A_1 - 2 y_4 A$,

(4) $F = B - y^6 C^2$.


We use the field of all constants. Let us see first that F is algebraically irreducible. If we consider the equation F=0 as an algebraic equation for $y_4$, we secure a function $y_4$ of two branches. Thus, if F were factorable, it would have a factor of positive degree free of $y_4$. Such a factor would have to be a factor of $y^6A^2$. As F is not divisible by y or by A, F is algebraically irreducible.

Let us determine now the components of the manifold of F other than the general solution.

Let $\mathfrak{N}$ be such a component. As $\partial F/\partial y_4 = 4y^6AC$, $\mathfrak{N}$ must be held by yC or by A. Suppose that A holds $\mathfrak{N}$. By (3) and (4), $y_3$ holds $\mathfrak{N}$. In every case then, B holds $\mathfrak{N}$.

Now B is the product of the four forms

(5) $E^{(j)} = A - j y_3^2$, $j = \pm 1, \pm(-1)^{1/2}$,

each of which is algebraically irreducible. For what follows, it is important to know that the manifold of each $E^{(j)}$ is irreducible. From the manner in which $z_3$ figures in (5), one sees that a component of the manifold of an $E^{(j)}$ distinct from the general solution is held by y. Such a component, being of dimension unity(10), must be the manifold of y. But the low power theorem(11) shows that the manifold of y is not a component. This proves the irreducibility of the manifolds of the $E^{(j)}$.
We have, for every j,

$C = y_3 E^{(j)}_1 - 2 y_4 E^{(j)}$.

Referring to (4) and applying the low power theorem, we find that the manifold of each $E^{(j)}$ is a component of the manifold of F(12).
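The two algebraic facts used above can be spot-checked numerically. The sketch below is only a sanity check, not part of the argument: it assumes the readings $A = y_1 - z_3y^2$, $B = A^4 - y_3^8$, $C = y_3A_1 - 2y_4A$ of (1) to (3), treats y, the $y_i$ and the $z_i$ as independent integers, and uses illustrative function names that do not occur in the paper.

```python
import itertools

def forms(y, y1, y2, y3, y4, z3, z4):
    """Values of A, A_1, B, C, with y, y_i, z_i treated as free integers (assumed reading of (1)-(3))."""
    A = y1 - z3 * y**2
    A1 = y2 - z4 * y**2 - 2 * z3 * y * y1   # formal derivative of A
    B = A**4 - y3**8
    C = y3 * A1 - 2 * y4 * A
    return A, A1, B, C

def check_point(y, y1, y2, y3, y4, z3, z4):
    """True if B is the product of the four E^(j) and C = y3*E1 - 2*y4*E for every j."""
    A, A1, B, C = forms(y, y1, y2, y3, y4, z3, z4)
    roots = (1, -1, 1j, -1j)                # the fourth roots of unity
    product = 1
    for j in roots:
        product *= A - j * y3**2
    ok = (product == B)
    for j in roots:
        E = A - j * y3**2
        E1 = A1 - 2 * j * y3 * y4           # formal derivative of E^(j)
        ok = ok and (y3 * E1 - 2 * y4 * E == C)
    return ok

# Small integer sample points keep the complex arithmetic exact.
ALL_OK = all(check_point(*p) for p in itertools.product((-2, 1, 3), repeat=7))
```

A check on sample points cannot prove a polynomial identity, but a failure here would show that the assumed reading of (1) to (5) is wrong.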


(10) C. P. §1.

(11) So we designate the theorem of C. P. §29.

(2) Technically, in applying the low power theorem, we have to multiply F by y} and to 
effect a reduction. Actually, on considering the proof of the low power theorem, one sees that 
one may dispense with this process of preparation. For instance, if one replaces y, in the coeffi- 




Thus the manifold of F has five components, the general solution and the manifolds of the $E^{(j)}$.

3. In what follows it will be proved that the intersection of the general solution of F with the manifold of y=0 is the manifold of the system y=0, $z_5=0$. The latter manifold is of dimension zero and of order 5. The proof employs some general results, bearing on ideals of differential polynomials, which will now be set forth.


DEDUCTIONS FROM LEVI’s THEOREM ON POWER PRODUCTS 


4. In what follows P is a power product in y and derivatives of y, d the degree of P, w the weight of P and p a positive integer.
Modifying a theorem due to Howard Levi(13), we derive the following result: If

(6) $d > \frac{p-1}{2} + \left[(p-1)w + \frac{(p-1)^2}{4}\right]^{1/2}$,

then(14)

$P \equiv 0, \quad [y^p]$.

We suppose, as we may, that p>1. Let (6) be satisfied. Then

(7) $(p-1)w < d^2 - d(p-1)$.

Let

(8) $d = a(p-1) + b$

where a and b are integers such that $a \ge 0$, $0 < b \le p-1$. As $b(p-1-b) \ge 0$, (7) gives

(9) $(p-1)w < d^2 - d(p-1) + b(p-1-b)$.

We replace d in (9) by its expression in (8), finding that

(10) $w < a(a-1)(p-1) + 2ab$.

By Levi's theorem, $P \equiv 0$, $[y^p]$.
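The passage from (7) to (10) is pure arithmetic and can be replayed on integers. The sketch below assumes the readings $(p-1)w < d^2 - d(p-1)$ for (7), $d = a(p-1)+b$ with $0 < b \le p-1$ for (8), and $w < a(a-1)(p-1) + 2ab$ for (10); the helper names are illustrative only.

```python
def decompose(d, p):
    """Write d = a*(p-1) + b with integers a >= 0 and 0 < b <= p-1, as in (8)."""
    a, r = divmod(d - 1, p - 1)
    return a, r + 1

def identity_holds(d, p):
    """Check d^2 - d(p-1) + b(p-1-b) == (p-1) * [a(a-1)(p-1) + 2ab]."""
    a, b = decompose(d, p)
    lhs = d * d - d * (p - 1) + b * (p - 1 - b)            # right member of (9)
    rhs = (p - 1) * (a * (a - 1) * (p - 1) + 2 * a * b)    # (p-1) times right member of (10)
    return lhs == rhs

def seven_implies_ten(d, p, w):
    """Check that hypothesis (7) forces conclusion (10) for the given integers."""
    a, b = decompose(d, p)
    if (p - 1) * w < d * d - d * (p - 1):                  # hypothesis (7)
        return w < a * (a - 1) * (p - 1) + 2 * a * b       # conclusion (10)
    return True

IDENTITY_OK = all(identity_holds(d, p) for p in range(2, 10) for d in range(1, 60))
IMPLICATION_OK = all(
    seven_implies_ten(d, p, w)
    for p in range(2, 8) for d in range(1, 30) for w in range(0, 150)
)
```

The identity explains why (9) reduces exactly to (10): the added term $b(p-1-b)$ is what makes the right member divisible by $p-1$.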


We denote by $\delta(p, w)$ the second member of (6).
5. Representing $y^p$ by u, we prove the following result, which holds for any power product P as in §4 and for any values of d, w, p.
P has a representation as a homogeneous polynomial in u and derivatives of u,


cient of E® in C by a new unknown u, E“ is seen immediately to furnish a component of the 
manifold of the form in u, y, s into which F is converted. 

(13) Trans. Amer. Math. Soc. vol. 51 (1942) p. 545.

(14) The notation, as regards congruences, is due to E. R. Kolchin, Ann. of Math. (2) vol. 42 (1941) p. 740.




whose coefficients are homogeneous polynomials(15) in y and derivatives of y of a common degree not greater than $\delta(p, w)$.

If $d \le \delta(p, w)$, P itself is the representation sought. Otherwise, by §4, P is a linear combination of the $u_j$, with coefficients all of degree d-p and none of weight greater than w. If $d - p \le \delta(p, w)$, we have the desired representation. Otherwise the coefficients of the $u_j$ will be in [u]. Continuing in this manner, we find P expressed as in our statement.


MULTIPLIERS OF A FORM 


6. Let $\Sigma$ be an ideal (differential) of forms in y and z; M a form in y and z; $\alpha$ a non-negative number. We shall say that M admits $\alpha$ as a multiplier with respect to $\Sigma$ if for every $\epsilon > 0$ there exists an integer $n_0(\epsilon)$ such that, for every $n > n_0(\epsilon)$,

$M^n \equiv P, \quad [\Sigma]$,

where P is a form depending on n which, arranged as a polynomial in the $y_i$(16), contains no term of degree less than $n(\alpha - \epsilon)$. P may be zero. If $\alpha$ is a multiplier for M and if $0 \le \gamma < \alpha$, $\gamma$ is also a multiplier.

We prove the following properties of multipliers: 

(a) Let M and N admit $\alpha$ and $\beta$, respectively, as multipliers with respect to $\Sigma$. Let $\gamma = \min(\alpha, \beta)$. Then M+N admits $\gamma$ as a multiplier.

(b) For M and N as in (a), MN admits $\alpha + \beta$ as a multiplier.

(c) Let $M^p$, where p is a positive integer, admit $\alpha$ as a multiplier. Then M admits $\alpha/p$.

(d) Let M admit $\alpha$ as a multiplier. Then $M_1$, the derivative of M, also admits $\alpha$.

(e) If $M \equiv N$, $[\Sigma]$, M and N admit the same multipliers.

Proving (a), we take an $\epsilon > 0$. Let $n_0(\epsilon/2)$ serve as above for both M and N with respect to $\epsilon/2$. We consider $(M+N)^n$ for any $n \ge 1$. Let $R = M^aN^b$ where a+b=n. If a and b both exceed $n_0(\epsilon/2)$, we have $R \equiv P$, $[\Sigma]$ where no term of P is of degree less than

$a(\alpha - \epsilon/2) + b(\beta - \epsilon/2)$,

which quantity is not less than $n(\gamma - \epsilon/2)$. If $b \le n_0(\epsilon/2) < a$, we have $R \equiv P$, $[\Sigma]$ with no term of P of degree less than

$[n - n_0(\epsilon/2)](\alpha - \epsilon/2)$.

This last quantity, if n is large in comparison with $n_0(\epsilon/2)$, exceeds $n(\alpha - \epsilon)$. The truth of (a) is now clear.


(15) The coefficients of the polynomials in the $y_i$ are rational numbers.
(16) When P is thus arranged, its coefficients will be forms in z. The definition of multiplier gives a special role to y.




The proofs of (b), (c) and (e) are trivial.

Proving (d), we take an $\epsilon > 0$ and, relative to M, an $n_0(\epsilon/2)$. Let m be a fixed integer which exceeds $n_0(\epsilon/2)$. We consider an n>0 and use $\delta(m, n)$ as in §4. Then $M_1^n$ is a polynomial in $M^m$ and its derivatives with coefficients which are forms in M of degree not greater than $\delta(m, n)$. In this expression for $M_1^n$, every power product in $M^m$ and its derivatives is of degree not less than

(11) $q = [n - \delta(m, n)]/m$.

Now, if n is large, $\delta(m, n)$, as one sees from (6), is small in comparison with n, so that q is only slightly less than n/m. Each power product in $M^m$ and its derivatives is congruent to a form whose terms have degrees in the $y_i$ not less than $qm(\alpha - \epsilon/2)$. If n is large, this last quantity exceeds $n(\alpha - \epsilon)$, q.e.d.


THE FORM F. FIRST OPERATION 


7. We return to F of §2, denoting the general solution of F by $\mathfrak{M}$. We show now that a solution in $\mathfrak{M}$ with y=0 satisfies $z_5=0$. Later, we shall prove that every z with $z_5=0$ is admissible.

We determine first a form G which holds $\mathfrak{M}$ but none of the other four components.
We have by (2) and (3),

(12) $AB_1 - 4A_1B = 4y_3^7C$.

Thus by (4) (first representation of F), we have, when F=0,

(13) $4y_3^7B^{1/2} = y^3(AB_1 - 4A_1B)$.

Again, letting $K = y^3C$, we have by (4), when F=0, the relation $B^{1/2} = K$. Thus, for F=0,

(14) $B^{-1/2}B_1 = 2K_1$.

Substituting into (13) the expression which (14) furnishes for $B_1$, and simplifying, we find for F=0, $B \ne 0$,

(15) $4y_3^{14} + L = 0$

where

(16) $L = -4y^3y_3^7AK_1 + y^6A^2K_1^2 - 4y^6A_1^2B$.

We designate the first member of (15) by G. Then G holds $\mathfrak{M}$.
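The identity (12) and the elimination leading to (15) can be checked mechanically. The sketch below is a plausibility check under stated assumptions, not the author's computation: it reads (12) as $AB_1 - 4A_1B = 4y_3^7C$, reads (15), (16) as $4y_3^{14} - 4y^3y_3^7AK_1 + y^6A^2K_1^2 - 4y^6A_1^2B = 0$, and treats all letters as independent numbers. The function names are hypothetical.

```python
from fractions import Fraction
import itertools

def identity_12(y, y1, y2, y3, y4, z3, z4):
    """Check A*B_1 - 4*A_1*B == 4*y3^7*C at an integer sample point."""
    A = y1 - z3 * y**2
    A1 = y2 - z4 * y**2 - 2 * z3 * y * y1
    B = A**4 - y3**8
    B1 = 4 * A**3 * A1 - 8 * y3**7 * y4     # formal derivative of B
    C = y3 * A1 - 2 * y4 * A
    return A * B1 - 4 * A1 * B == 4 * y3**7 * C

def elimination_residual(y, A, A1, K, K1):
    """Residual of the assumed form of (15) once (13), (14) are imposed with B = K^2, K != 0.

    s stands for y_3^7; (13) and (14) force 4*s = 2*y^3*A*K1 - 4*y^3*A1*K.
    """
    s = Fraction(2 * y**3 * A * K1 - 4 * y**3 * A1 * K, 4)
    B = K * K
    return 4 * s * s - 4 * y**3 * s * A * K1 + y**6 * A * A * K1 * K1 - 4 * y**6 * A1 * A1 * B

IDENTITY_12_OK = all(identity_12(*p) for p in itertools.product((-2, 1, 3), repeat=7))
ELIMINATION_OK = all(
    elimination_residual(*p) == 0 for p in itertools.product(range(-2, 3), repeat=5)
)
```

The second check mirrors the squaring step: isolating the term in $B^{1/2}$ before squaring is what removes the radical and produces a form G.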
8. In what follows, all multipliers will operate with respect to [F, G], the differential ideal generated by F and G.
In (4), $y_3^8$ and $y^6C^2$ contain no terms of degree less than 8 in the $y_i$. Thus $A^4$ admits 8 as a multiplier so that, by (c) of §6, A admits 2. Now $z_3y^2$ admits 2. By (a) of §6, $y_1$ admits 2. Then, by (d), every $y_i$ with $i \ge 1$ admits 2. From (3), using (a), (b), (d), we find that C admits 4. Referring to (4) and using (e), we see now that $A^4$ admits 14, so that A admits 3. By (3), now, C admits 5 and we find from (4) that A admits 4. We return to (3) and see that C admits 6. Also by (4), B admits 18. Finally K of §7 admits 9.

By (16), L admits 30. By (15), $y_3$ admits 15/7. Now $y_2 - z_4y^2 - 2z_3yy_1$, which is $A_1$, admits 4. As $y_1$ admits 2, $y_2 - z_4y^2$ admits 3. Then $y_3 - z_5y^2 - 2z_4yy_1$ admits 3, so that $y_3 - z_5y^2$ admits 3. As $y_3$ admits 15/7, $z_5y^2$ admits 15/7.

We infer that [F, G] contains a form of the type $(z_5y^2)^n + M$ where every term of M is of degree greater than 2n in the $y_i$. It follows from the low power theorem that a solution in $\mathfrak{M}$ cannot have y=0 unless $z_5=0$.
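The bookkeeping of multipliers in this section can be replayed with rules (a) to (d) of §6: a sum admits the minimum, a product admits the sum, a p-th root divides by p, and a derivative keeps the multiplier. The sketch below assumes the readings $F = B - y^6C^2$, $K = y^3C$ and $L$ as a sum of terms $y^3y_3^7AK_1$, $y^6A^2K_1^2$, $y^6A_1^2B$; it keeps the exact fractional values, whereas the text rounds the intermediate values down to 3 and 5. All names are illustrative.

```python
from fractions import Fraction

def replay_multipliers():
    """Iterate the estimates for A and C until they stabilize, then derive B, K, L, y_3."""
    y = Fraction(1)       # y itself has degree 1 in the y_i
    yi = Fraction(2)      # y_1, y_2, ... admit 2 once A admits 2 (rules (a), (d))
    a = Fraction(2)       # A^4 admits 8, so A admits 2 (rule (c))
    history = [a]
    while True:
        c = yi + a                          # C = y_3*A_1 - 2*y_4*A: both terms admit yi + a
        a4 = min(8 * yi, 6 * y + 2 * c)     # A^4 = y_3^8 + y^6*C^2 modulo F (rules (a), (b), (e))
        new_a = a4 / 4                      # rule (c)
        if new_a == a:
            break
        a = new_a
        history.append(a)
    b = 6 * y + 2 * c                       # B = y^6*C^2 modulo F
    k = 3 * y + c                           # K = y^3*C
    l = min(3 * y + 7 * yi + a + k,         # y^3 * y_3^7 * A * K_1
            6 * y + 2 * a + 2 * k,          # y^6 * A^2 * K_1^2
            6 * y + 2 * a + b)              # y^6 * A_1^2 * B
    y3 = l / 14                             # y_3^14 is congruent to -L/4, rules (e), (c)
    return {"A": a, "C": c, "B": b, "K": k, "L": l, "y3": y3, "history": history}

RESULT = replay_multipliers()
```

Under these assumptions the replay ends with A admitting 4, C admitting 6, B admitting 18, K admitting 9, L admitting 30 and $y_3$ admitting 15/7, in agreement with the values quoted in the text.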


SECOND OPERATION 


9. Let $\alpha$ be any polynomial of effective degree 4. We shall prove that $\mathfrak{M}$ contains y=0, $z=\alpha$. This will imply that every z for which $z_5=0$ appears in $\mathfrak{M}$ with y=0 and our investigation of F will be completed.

Representing by c an arbitrary constant and by v a new unknown, we put $z=\alpha$ in F and then make in F the substitution(17)


6 
(17) ca: 


We represent by $A'$, $A_1'$, $B'$, $C'$, $F'$ the expressions into which A, $A_1$, B, C, F are, respectively, transformed when z is replaced by $\alpha$ and y by the second member of (17).
We find from (17) 


(18) A’ = + 
with P a polynomial in x, c, v. Then we may write 
(19) Ai = cv, + c0. 


In (17), the coefficient of c? is of the second degree in x; that of c is of 
the fourth degree. We have thus 


(20) y= cB+---; 
with 6 of the first degree and y constant. By (18), (19), (20), 

C’ = — 2y0:) + 
with R a polynomial in x, c and the v; with 7 <4. We find thus 
(21) F’ = — — (B02 — 2y0:)") +c 
with T of the type of R. 


Subscripts of indicate differentiation. 




10. Let V represent the coefficient of c* in F’. As 80, the differential 
equation V =0 is effectively of the second order. Let then v=€ be a solution 
of V=0 with 


(22) ti — #0. 
We wish to show that F’ is formally annulled by a series 
(23) v= E+ 


of the following description. The $\rho_i$ are positive rational numbers, with a common denominator, which increase with their subscripts. The $\varphi_i$ are analytic functions of x, all analytic at some point at which $\xi$ is analytic(18).

It will suffice to show that G= F’/c™ is annulled by a series (23). If G 
vanishes identically in x and c for v=£, then v=¢£ is an acceptable series (23). 
In what follows, we assume that such vanishing does not occur. 

Introducing a new unknown 1, we put, in G, v=£+4,: Then G goes over 
into an expression H’ in x, c and 1, 


(24) H! = a'(c) + 

Here > contains the terms of H’ which are not free of the 1; and, in > a 
4 ranges from unity to some positive integer. As to a’(c) and the 5/ (c), they 
are polynomials in c with analytic functions of x for coefficients. Because £ 
does not annul G identically, a’(c) is not identically zero. On the other hand, 
because G vanishes for v=, c=0, the lowest power of c in a’(c) is positive. 


Because the bracketed terms in (21) contribute effectively to > in (24), cer- 
tain of the b/ (c) contain terms of power zero in c. 

Let a’ be the least exponent of ¢ in a’ and o/ the least exponent of c in b/. 
Let 


—a 


where i has the range which it has in }>. As o’>0 and certain o/ equal 0, 
p2>0. 

We now take over §§12-16 of our paper On the singular solutions of alge- 
braic differential equations(), putting m=4 in that discussion. We are 
brought to the series (23) for v. 

11. We have shown, all in all, that F, for z=a, is annulled by a series 


(25) 


where the unwritten terms have rational exponents greater than 6. The series 
(25) does not annul B for z=a. Indeed, 


pP2 = max 


(18) One may suppose that $\varphi_1 = \xi$, $\rho_1 = 0$.
(19) Ann. of Math. (2) vol. 37 (1936) p. 541.




Bi 


and, because of (22), the coefficient of c* in B’ does not vanish for v=€. 

It follows that every form which holds $\mathfrak{M}$ vanishes for $z=\alpha$ and for y as in (25). This means that y=0, $z=\alpha$ is in $\mathfrak{M}$.
REMARKS 

12. If in (1) to (4), we replace $z_3$, $y_3$, $y_4$ wherever they appear by $z_{n-1}$, $y_{n-1}$, $y_n$, respectively, where $n \ge 4$, we obtain a form F with a general solution which intersects the manifold of y=0 in that of y=0, $z_{2n-3}=0$; the proofs require only the slightest changes.

In F of §2, if one replaces $z_3$ by z, one obtains a form which is of the first order in z and has a general solution which intersects the manifold of y=0 in that of y=0, $z_2=0$. This in itself is sufficiently anomalous. However, if it is desired to secure a form F whose order in z cannot be reduced, it suffices to replace $y_3$ and $y_4$ in (2), (3), (4) by $zy_3$ and its derivative, respectively.


CoLuMBIA UNIVERSITY, 
New York, N. Y. 


THE CONTINUITY OF FUNCTIONS OF MANY VARIABLES 


BY 
RICHARD KERSHNER 


1. Introduction. It is known that a function f(x, y) of two real variables 
may be continuous with respect to each variable separately throughout a 
given region without being continuous with respect to (x, y) at all points of 
the region. In fact, W. H. and G. C. Young(1) have given an example of a
function f(x, y) which is a continuous function of the position along every 
straight line in the unit square [0, 1] x [0, 1] but which has an uncountable 
number of two-dimensional discontinuities in every rectangle contained in the 
unit square. The example of W. H. and G. C. Young could easily be modified 
so as to yield a function continuous along every analytic arc but with an un- 
countable number of discontinuities in every rectangle. 

If the number of variables is greater than two the situation becomes even worse. As was pointed out by Baire(2), for three variables, and subsequently by Hahn(3), for any number of variables, a function $f(x_1, x_2, \cdots, x_n)$ may be continuous in each variable $x_i$ and yet be discontinuous with respect to $(x_1, x_2, \cdots, x_n)$ at every point of an (n-2)-dimensional rectangle. In fact let $g(x_1, x_2)$ be a function continuous in $x_1$ and $x_2$ separately but discontinuous at (0, 0). Then

$f(x_1, x_2, \cdots, x_n) = g(x_1, x_2)$

is discontinuous at every point of the (n-2)-dimensional region $x_1=0$, $x_2=0$. Finally, Lebesgue(4) has shown that a function $f(x_1, x_2, \cdots, x_n)$ which is continuous in each variable $x_i$ separately may be of the (n-1)st Baire class, although no worse.
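A concrete instance of this construction may help. The sketch below uses the standard textbook choice $g(x_1, x_2) = 2x_1x_2/(x_1^2 + x_2^2)$ with $g(0, 0) = 0$, which is an assumption of this illustration and not a function taken from the paper; it is continuous in each variable separately but not at the origin.

```python
def g(x1, x2):
    """Continuous in x1 and in x2 separately, but discontinuous at (0, 0)."""
    if x1 == 0.0 and x2 == 0.0:
        return 0.0
    return 2.0 * x1 * x2 / (x1 * x1 + x2 * x2)

def f(*xs):
    """The n-variable function f(x_1, ..., x_n) = g(x_1, x_2)."""
    return g(xs[0], xs[1])

SCALES = (0.1, 0.01, 0.001)

# Approaching a point of the region x_1 = 0, x_2 = 0 along either axis, f is identically 0 ...
AXIS_VALUES = [f(t, 0.0, 0.3) for t in SCALES] + [f(0.0, t, 0.3) for t in SCALES]

# ... while along the diagonal x_1 = x_2 it is identically 1, however close to that region.
DIAGONAL_VALUES = [f(t, t, 0.3) for t in SCALES]
```

Since the third coordinate plays no role, the same jump occurs at every point (0, 0, x_3), which is the (n-2)-dimensional set of discontinuities described above for n=3.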

The problem of considering how much could be said concerning the n-dimensional continuity points of a function $f(x_1, x_2, \cdots, x_n)$ which is assumed to be continuous in each $x_i$ separately was first treated in 1899 by Baire in the fundamental paper(5) in which he introduced most of the classic notions associated with his name. For the case of two variables his results were complete. He showed that


Presented to the Society, December 30, 1941; received by the editors April 16, 1942. 

(1) W. H. Young and G. C. Young, Discontinuous functions continuous with respect to every straight line, Quart. J. Math. Oxford Ser. vol. 41 (1910) pp. 87-93.

(2) R. Baire, Sur les fonctions de variables réelles, Annali di Matematica Pura ed Applicata vol. 3 (1899) pp. 1-122.

(3) H. Hahn, Über Funktionen mehrerer Veränderlichen, die nach jeder einzelnen Veränderlichen stetig sind, Math. Zeit. vol. 4 (1919) pp. 306-313.

(4) H. Lebesgue, Sur les fonctions représentables analytiquement, J. Math. Pures Appl. (6) vol. 1 (1905) pp. 139-212, cf. pp. 201, 202.

(5) Loc. cit. Footnote 2.




(A) If f(x, y) is, in the unit square, a continuous function of x, for fixed y, and a continuous function of y, for fixed x, then there is a residual set of lines parallel to each axis consisting entirely of points where f(x, y) is continuous with respect to (x, y).


Here and throughout the paper a set contained in an interval is called a 
residual set if its complement in that interval is of first category. 

It should be recalled that, in consequence of Baire’s classic theorem, a residual set in an interval is uncountably dense in every subinterval. Thus the statement obtained from (A) by reading “dense” for “residual” is an immediate consequence of (A). Baire explicitly stated only this weaker consequence, although he in fact proved (A). It follows immediately from (A) that


(B) Under the assumptions of (A) every line parallel to an axis contains a 
dense set of points where f(x, y) is continuous with respect to (x, y). 


Clearly (A) is much stronger than (B). For example, (A) does, but (B) 
does not, imply that 


(C) Under the assumptions of (A) the set of points where f(x, y) is discon- 
tinuous with respect to (x, y) has dimension (Menger) at most zero. 


Of course this beautiful theorem (C) was not stated by Baire, since the 
general notion of dimension involved is of a later date. 

Baire also treated the case of three variables in the same 1899 paper. Un- 
fortunately he was not able to generalize the result (A) (which, as is shown 
in §6, is definitive) but only the weaker consequence (B). Of course, in view 
of the fact that f(x, y, z) may be of the second Baire class, it is rather surpris- 
ing that even (B) may be extended to the case of three variables. However, 
Baire was able to show that every plane parallel to a coordinate plane must 
contain a dense set of continuity points for f(x, y, z), assumed continuous in 
x, y and z separately. 

The methods of Baire apparently yielded nothing for $f(x_1, x_2, \cdots, x_n)$ if n>3, and it was not until 1919 that Hahn(6) showed that, in spite of the increasing Baire class, a generalization of (B) remained valid for any n. In fact, it was shown that any (n-1)-dimensional hyperplane obtained by fixing one coordinate must contain a dense set of continuity points of a function $f(x_1, x_2, \cdots, x_n)$ assumed continuous in each $x_i$.

The only other consideration of the problem of which I am aware is a paper of Bögel(7) who established the conclusion of (B) under somewhat weaker assumptions than those of Baire.


(6) Loc. cit. Footnote 3.
(7) K. Bögel, Über die Stetigkeit und die Schwankung von Funktionen zweier reeller Veränderlichen, Math. Ann. vol. 81 (1920) pp. 64-93.




The main purpose of this paper is to show that not only (B) but also the 
definitive result (A) can be generalized to the case of an arbitrary number 
of variables. The precise result, stated here for the case of three variables, is 
the following: 

If f(x, y, z) is, in the unit cube, a continuous function of x, and of y, and of z then there is a residual set of planes parallel to each coordinate plane, on each of which there is a residual set of lines parallel to each possible coordinate axis consisting entirely of points where f(x, y, z) is continuous with respect to (x, y, z).

In particular the result to be established in the case of an arbitrary number of variables is strong enough to establish the natural generalization of (C), that is,

If $f(x_1, x_2, \cdots, x_n)$ is, in the unit cube, a continuous function of each $x_i$, then the set of points where $f(x_1, x_2, \cdots, x_n)$ is discontinuous with respect to $(x_1, x_2, \cdots, x_n)$ has dimension at most (n-2).

It will be recalled that the set of discontinuities may have dimension n-2 even if $f(x_1, x_2, \cdots, x_n)$ is an algebraic function.

The extension of the Baire results to the case of more than two variables 
is based mainly on a set-theoretic lemma (Lemma 2) which is proved in §2. 
Section 2 also contains a list of the notations and results of a set-theoretic 
nature that will be needed later. 

Section 3 contains a similar list of notations and results of a function- 
theoretic nature that will be used. These are surprisingly few in number and 
elementary in nature. 

In §4 a greatly simplified proof is given of the Baire result for a function of two variables. This is technically unnecessary since the induction proof to be given in §5 for the case of any number of variables could be based on the completely trivial case n=1 rather than the Baire case n=2, but the Baire result does not seem to be so well known as it deserves to be and it seemed that a direct modern proof might be useful.

In §6 it is shown that the results of §5 are the best possible. It seems not 
to have been known even that the Baire result (A) was best possible. 

It might be wondered whether the results of §5 can be strengthened by requiring more smoothness, for example, the existence of partial derivatives or a 1-dimensional Lipschitz condition, parallel to the axes. This problem is treated in §7, where it is shown that an assumption considerably weaker than a 1-dimensional Lipschitz condition of any order $\alpha > 0$ is sufficient to ensure that the discontinuities of $f(x_1, x_2, \cdots, x_n)$ are nowhere dense. This result seems to be new even in the simplest case n=2, although Bögel(8) has a much weaker result in this direction in the case n=2.

Finally, in §8, it is shown that the result of §7 is definitive and in fact 
that no further restrictions on the set of discontinuities are imposed by re- 
quiring any degree of smoothness, short of analyticity, parallel to the axes. 


(8) K. Bögel, Über partiell differenzierbare Funktionen, Math. Zeit. vol. 25 (1926) pp. 490-498.




It might be mentioned that the results of this paper could easily be ex- 
tended to quite abstract product spaces but the author’s interests do not lie 
in that direction. 

2. Preliminaries. In this section will be listed certain notations and re- 
sults, some of them classic, that will be used in this paper. Special attention 
is called to the set-theoretic Lemma 2, which seems to be rather powerful. 

Set. The bracket notation $[a; \cdots]$ (or $[a \in S; \cdots]$) is used to denote the set of all those elements a (of S) for which the specified restrictions “···” are satisfied.

Interval. All intervals are understood to be open unless otherwise specified. The closed unit real interval [0, 1] is denoted by U or $U_1$. The closed unit n-interval $U \times U \times \cdots \times U$ is denoted by $U_n$. If I=(a, b) is a subinterval of U then |I| denotes its length b-a.

$F_\sigma$. A set is called an $F_\sigma$-set if it is the sum of a countable number of closed sets.

Dimension. The recursive (Menger-Urysohn) definition of dimension will be used. The empty set has dimension -1. A set S has dimension at most m if each neighborhood of every point of S contains another neighborhood of that point whose boundary intersects S in a set with dimension at most m-1.

Category. A subset of $U_n$ is called of the first category if it is the sum of a countable number of sets nowhere dense in $U_n$. All other subsets of $U_n$ are called of the second category.

Residual. A subset of an interval of $U_n$ is called residual in that interval if its complement in that interval is of the first category.


BAIRE’S THEOREM. The empty set is not residual, that is, an interval is of the 
second category. 


Baire’s theorem will be used mainly in conjunction with one of the fol- 
lowing two lemmas of which the first is classic (and trivial) and the second 
seems to be new. 


Lemma 1. Let $\{S_j\}$ be a sequence of sets in $U_n$ such that $S_j$ is closed and $\sum S_j$ is of the second category in $U_n$. Then, for some integer k, $S_k$ contains a subinterval of $U_n$.


Proof. Since $\sum S_j$ is of the second category, not all $S_j$ are nowhere dense. Thus some $S_k$ is dense in a subinterval I of $U_n$. But $S_k$ is closed, so $S_k \supset I$.

In order to facilitate the statement of the other desired lemma, two defini- 
tions are given next. 

Linearly closed. A set $S \subset U_n$ will be called linearly closed if it intersects each line parallel to a coordinate axis of $U_n$ in a closed set. Thus S is linearly closed if, for each fixed i and each fixed

$(x_1, \cdots, x_{i-1}, x_{i+1}, \cdots, x_n) \in U_{n-1}$,

the set

$[x_i \in U; (x_1, \cdots, x_{i-1}, x_i, x_{i+1}, \cdots, x_n) \in S]$

is closed.

Clearly a closed set is linearly closed, but not conversely.

Property A. This will be defined for sets in $U_n$ inductively with respect to n. A set $S \subset U_1$ has property A if it is of second category in $U_1$. A set $S \subset U_n$ has property A if there is a set $R \subset U_{n-1}$ such that

R has property A in $U_{n-1}$

and such that if

$\mathbf{x}_{n-1} = (x_1, \cdots, x_{n-1}) \in R$

then the set

$[x_n \in U; (\mathbf{x}_{n-1}, x_n) \in S]$

is of second category in U.

In view of Baire’s theorem it is clear that $U_n$ itself has property A and R may be chosen as $U_{n-1}$. More generally any set which contains a subinterval of $U_n$ has property A in $U_n$.


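Lemma 2. Let $\{S_j\}$ be a sequence of sets in $U_n$ such that $S_j$ is linearly closed and $\sum S_j$ has property A. Then, for some k, $S_k$ contains a subinterval of $U_n$.


Proof. The proof is by induction on n. If n=1, Lemma 2 follows from Lemma 1. Suppose then that the case n-1 of Lemma 2 is valid.
Since $\sum S_j$ has property A there is a set $R \subset U_{n-1}$ such that

(1) R has property A in $U_{n-1}$

and

(2) $\mathbf{x}_{n-1} \in R$ implies $T = T(\mathbf{x}_{n-1}) = [x_n \in U; (\mathbf{x}_{n-1}, x_n) \in \sum S_j]$ of second category.

For fixed $\mathbf{x}_{n-1} \in R$ let

(3) $T_j = T_j(\mathbf{x}_{n-1}) = [x_n \in U; (\mathbf{x}_{n-1}, x_n) \in S_j]$, $j = 1, 2, \cdots$.

Then $T_j$ is closed since $S_j$ is linearly closed. Also $\sum T_j = T$ is of second category by (3), (2). Thus, by Lemma 1, there is, for each $\mathbf{x}_{n-1} \in R$, at least one integer k and a corresponding interval $I \subset U$ such that $T_k \supset I$. Thus, if for each $\mathbf{x}_{n-1} \in R$ and for each $k = 1, 2, \cdots$, one defines

(4) $\Lambda(\mathbf{x}_{n-1}, k) = \max\{|I|; I \subset T_k \subset U\}$

(which exists since $T_k$ is closed), then

(5) for every $\mathbf{x}_{n-1} \in R$ there is a k such that $\Lambda(\mathbf{x}_{n-1}, k) > 0$.

Now, for each $j = 1, 2, \cdots$; $k = 1, 2, \cdots$, let

$M_{j,k} = [\mathbf{x}_{n-1} \in R; \Lambda(\mathbf{x}_{n-1}, k) \ge 1/j]$.

Then, clearly, $\mathbf{x}_{n-1} \in M_{j,k}$ if and only if there is an interval $I = I(\mathbf{x}_{n-1}, k) \subset U$, such that

(6) $|I(\mathbf{x}_{n-1}, k)| \ge 1/j$ and $x_n \in I(\mathbf{x}_{n-1}, k)$ implies $(\mathbf{x}_{n-1}, x_n) \in S_k$.

It will now be shown that the induction assumption may be applied to the sequence of sets $M_{j,k} \subset R \subset U_{n-1}$. First $\sum M_{j,k} = R$. In fact if $\mathbf{x}_{n-1} \in R$ then by (5), $\Lambda(\mathbf{x}_{n-1}, k) \ge 1/j$ for some j, k, and so $\mathbf{x}_{n-1} \in M_{j,k}$. Thus $\sum M_{j,k}$ has property A by (1).

It remains to show that $M_{j,k}$ is linearly closed. To this end it is sufficient to prove that if $\{x_{i,h}\}$ is, for fixed $1 \le i \le n-1$, a sequence of numbers in U such that

(7) $\lim_{h \to \infty} x_{i,h} = \bar{x}_i$ exists

and

(8) $(x_1, \cdots, x_{i-1}, x_{i,h}, x_{i+1}, \cdots, x_{n-1}) \in M_{j,k}$, $h = 1, 2, \cdots$,

then

$(x_1, \cdots, x_{i-1}, \bar{x}_i, x_{i+1}, \cdots, x_{n-1}) \in M_{j,k}$.

It has been seen, in (6), that (8) implies the existence of an interval $I = I_h \subset U$ such that $|I_h| \ge 1/j$ and

$x_n \in I_h$ implies $(x_1, \cdots, x_{i-1}, x_{i,h}, x_{i+1}, \cdots, x_{n-1}, x_n) \in S_k$.

Let $x_{n,h}$ denote the midpoint of $I_h$. Then there is a subsequence $x_{n,h_m}$ which is convergent, say

$x_{n,h_m} \to \bar{x}_n$.

Now let $x_n$ be a fixed number in U such that

(9) $|x_n - \bar{x}_n| < 1/2j$

so that

$|x_n - \bar{x}_n| = (1/2j) - \delta$

for some $\delta > 0$. Then, for sufficiently large m,

$|x_{n,h_m} - \bar{x}_n| < \delta$

so that

$|x_n - x_{n,h_m}| < 1/2j$.

Thus $x_n \in I_{h_m}$ for sufficiently large m. Then

$(x_1, \cdots, x_{i-1}, x_{i,h_m}, x_{i+1}, \cdots, x_{n-1}, x_n) \in S_k$

for sufficiently large m. Then, since $S_k$ is linearly closed,

$(x_1, \cdots, x_{i-1}, \bar{x}_i, x_{i+1}, \cdots, x_{n-1}, x_n) \in S_k$

by (7). But this is true for any fixed $x_n$ in the interval (9) of length 1/j. Thus

$(x_1, \cdots, x_{i-1}, \bar{x}_i, x_{i+1}, \cdots, x_{n-1}) \in M_{j,k}$

and it has been shown that $M_{j,k}$ is linearly closed.

The induction assumption may now be applied to the sequence $\{M_{j,k}\}$ and it is found that, for some integers $j_1$, $k_1$ the set $M_{j_1,k_1}$ contains a subinterval of $U_{n-1}$. Let $J_1$ be a closed subinterval of $U_{n-1}$ contained in $M_{j_1,k_1}$. Thus, for every point $\mathbf{x}_{n-1}$ of the interval $J_1$ there is an interval $I = I(\mathbf{x}_{n-1}) \subset U$ such that

$|I(\mathbf{x}_{n-1})| \ge 1/j_1$

and

$x_n \in I(\mathbf{x}_{n-1})$ implies $(\mathbf{x}_{n-1}, x_n) \in S_{k_1}$.

Now let $K_j$, for $j = 1, 2, \cdots, 2j_1$, denote the interval

$K_j = ((j-1)/(2j_1), j/(2j_1)) \subset U$

of length $1/(2j_1)$. Then let $N_j$, for $j = 1, 2, \cdots, 2j_1$, be the set

$N_j = [\mathbf{x}_{n-1} \in J_1; x_n \in K_j$ implies $(\mathbf{x}_{n-1}, x_n) \in S_{k_1}]$.

Then it is very easily seen that $N_j$ is linearly closed. On the other hand, if $\mathbf{x}_{n-1} \in J_1$ then $I(\mathbf{x}_{n-1})$, being an interval of length at least $1/j_1$, must contain some one of the intervals $K_j$. Then $\mathbf{x}_{n-1} \in N_j$ for some j, that is, $\mathbf{x}_{n-1} \in \sum N_j$. Thus $\sum N_j \supset J_1$ and so has property A by Baire’s theorem. Thus, again applying the induction hypothesis, some $N_{j_2}$ contains a subinterval $J_2 \subset J_1 \subset U_{n-1}$. Then the subinterval $J_2 \times K_{j_2}$ of $U_n$ is contained in $S_{k_1}$. This completes the proof of Lemma 2.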
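The covering step near the end of the proof, that an interval of length at least $1/j_1$ inside U must contain one of the $2j_1$ intervals $K_j$ of length $1/(2j_1)$, can be made explicit. The sketch below assumes the reading $K_j = ((j-1)/(2j_1), j/(2j_1))$ and uses exact rational arithmetic; the function names are illustrative.

```python
from fractions import Fraction
import math

def covered_index(a, j1):
    """Return an index j such that K_j lies inside the interval (a, a + 1/j1).

    The left end of K_j is chosen as the first point of the grid {m/(2*j1)}
    at or to the right of a.
    """
    h = Fraction(1, 2 * j1)                  # common length of the K_j
    return math.ceil(Fraction(a) / h) + 1

def contains(a, length, j1, j):
    """True if K_j = ((j-1)h, jh), with h = 1/(2*j1), lies inside (a, a + length)."""
    h = Fraction(1, 2 * j1)
    return a <= (j - 1) * h and j * h <= a + length

def check_all(max_j1=6, grid=120):
    """Check the covering claim for every left end a = i/grid with (a, a + 1/j1) inside U."""
    for j1 in range(1, max_j1 + 1):
        length = Fraction(1, j1)
        for i in range(grid + 1):
            a = Fraction(i, grid)
            if a + length > 1:
                continue
            j = covered_index(a, j1)
            if not (1 <= j <= 2 * j1 and contains(a, length, j1, j)):
                return False
    return True

COVERING_OK = check_all()
```

The point of halving the mesh is visible in the code: the grid point at or to the right of a lies within $1/(2j_1)$ of a, which leaves room for one full interval $K_j$ before $a + 1/j_1$.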

Another property of subsets of U, which is connected with property A 
will be useful later and is defined now. 

Property Γ. Again the definition is inductive for sets in $U_n$. A set $S \subset U_1$ has property Γ if it is residual in $U_1$. A set $S \subset U_n$ has property Γ if there is a set $R \subset U_{n-1}$ such that

R has property Γ in $U_{n-1}$

and such that if $\mathbf{x}_{n-1} \in R$ then the set

$[x_n \in U; (\mathbf{x}_{n-1}, x_n) \in S]$

is residual in U.
Clearly a set with property Γ has also property A, not conversely. However, there is a more striking connection expressed by the following:




Lemma 3. A set $S \subset U_n$ has property Γ if and only if $U_n - S$ does not have property A.

Proof. This can be proved by a straightforward induction. (If n=1 the 
statement reduces to the definition of a residual set.) The details of the proof 


suggest themselves readily. 
Projection. If $S \subset U_n$ then, for each $i = 1, 2, \cdots, n$, the set of all

$(x_1, \cdots, x_{i-1}, x_{i+1}, \cdots, x_n) \in U_{n-1}$

such that there exists an $x_i \in U$ for which

$(x_1, \cdots, x_{i-1}, x_i, x_{i+1}, \cdots, x_n) \in S$

is called the projection of S on $U_{n-1}^i$.
The last set-theoretic lemma that will be needed involves a connection between the property A and the dimension of a subset of $U_n$.


Lemma 4. Let $S \subset U_n$. Suppose that the projection of S on $U_{n-1}^i$, for each $i = 1, 2, \cdots, n$, fails to have property A. Then S has dimension at most n-2.


Proof. Again this can be shown by a straightforward induction which it 
does not seem necessary to present in detail. It should be mentioned that one 
uses Lemma 3 and two well known facts about residual sets, namely that the 
product of two residual sets is residual and that a residual set is dense. 

3. Further preliminaries. Throughout this section let $f(x_1, x_2, \cdots, x_n) = f(\mathbf{x}_{n-1}, x_n) = f(\mathbf{x}_n)$ be defined in $U_n$ and be a continuous function of each $x_i$ for fixed $x_1, \cdots, x_{i-1}, x_{i+1}, \cdots, x_n$. In particular $f(\mathbf{x}_{n-1}, x_n)$ is a continuous function of $x_n$ in U.

Unicontinuous. For convenience a function f satisfying the above conditions will be referred to as unicontinuous.

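$\delta_\epsilon(\mathbf{x}_{n-1})$. For fixed $\mathbf{x}_{n-1} \in U_{n-1}$, f is a uniformly continuous function of $x_n$, that is, for every $\epsilon > 0$ there is a $\delta_\epsilon$ such that

$|x_n' - x_n''| \le \delta_\epsilon$

implies

$|f(\mathbf{x}_{n-1}, x_n') - f(\mathbf{x}_{n-1}, x_n'')| \le \epsilon$.

It is apparent that for each $\epsilon > 0$ and for fixed $\mathbf{x}_{n-1}$, there is a greatest such $\delta_\epsilon$. This greatest $\delta_\epsilon$ will be denoted by $\delta_\epsilon(\mathbf{x}_{n-1})$.


Lemma 5. For each $\epsilon > 0$, $\eta > 0$ the set of points

$[\mathbf{x}_{n-1} \in U_{n-1}; \delta_\epsilon(\mathbf{x}_{n-1}) \ge \eta]$

is linearly closed.
Proof. This statement, which expresses the upper semi-continuity of $\delta_\epsilon(x_1, x_2, \cdots, x_{n-1})$ with respect to each of $x_1, \cdots, x_{n-1}$, is reasonably well known and in any case is trivial.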
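The behavior of $\delta_\epsilon$ can be seen on a small numerical experiment. The sketch below is an illustration under assumptions that are not in the paper: it takes n=2, the test function $g(x_1, x_2) = 2x_1x_2/(x_1^2 + x_2^2)$ with $g(0, 0) = 0$, the value $\epsilon = 1/2$, and approximates the greatest $\delta_\epsilon$ on a uniform grid in $x_2$. For this g a short computation gives $\delta_\epsilon(x_1) = (2 - \sqrt{3})x_1$ for $0 < x_1 \le 1$, while $\delta_\epsilon(0) = 1$.

```python
import math

def g(x1, x2):
    """Separately continuous test function, discontinuous at the origin."""
    if x1 == 0.0 and x2 == 0.0:
        return 0.0
    return 2.0 * x1 * x2 / (x1 * x1 + x2 * x2)

def delta_eps(section, eps, n=2000):
    """Grid estimate of the greatest delta such that |t' - t''| <= delta
    implies |section(t') - section(t'')| <= eps on [0, 1]."""
    values = [section(i / n) for i in range(n + 1)]
    k = 0
    # Enlarge the admissible distance one grid step at a time.
    while k < n and all(abs(values[i + k + 1] - values[i]) <= eps for i in range(n - k)):
        k += 1
    return k / n

def delta_for(x1, eps=0.5):
    """Estimate of delta_eps(x1) for the section x2 -> g(x1, x2)."""
    return delta_eps(lambda x2: g(x1, x2), eps)

D_ZERO = delta_for(0.0)
D_HALF = delta_for(0.5)
D_SMALL = delta_for(0.05)
```

The estimates shrink in proportion to $x_1$ and then jump up to 1 at $x_1 = 0$. An upward jump of this kind is compatible with Lemma 5, since the sets where $\delta_\epsilon \ge \eta$ remain closed; it is the sets where $\delta_\epsilon$ is small that fail to be closed.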
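$\Omega(f, S)$. If S is a subset of $U_n$ then $\Omega(f, S)$ will denote

$\Omega(f, S) = \text{l.u.b.}\,[f(\mathbf{x}_n); \mathbf{x}_n \in S] - \text{g.l.b.}\,[f(\mathbf{x}_n); \mathbf{x}_n \in S]$.

Clearly $S \subset T$ implies $0 \le \Omega(f, S) \le \Omega(f, T)$.
$\Omega(f, \mathbf{x}_n)$. If $\mathbf{x}_n \in U_n$ then $\Omega(f, \mathbf{x}_n)$ will denote

$\Omega(f, \mathbf{x}_n) = \text{g.l.b.}\,[\Omega(f, S); S \text{ is open and } \mathbf{x}_n \in S]$.

Lemma 6. If $\mathbf{x}_n \in U_n$ and $S_h$ is the open cube of side 1/h centered at $\mathbf{x}_n$ then

$\Omega(f, \mathbf{x}_n) = \lim_{h \to \infty} \Omega(f, S_h)$.

Lemma 7. The function f is continuous at $\mathbf{x}_n$ if and only if $\Omega(f, \mathbf{x}_n) = 0$.
Lemma 8. For each $\eta > 0$ the set

$[\mathbf{x}_n \in U_n; \Omega(f, \mathbf{x}_n) \ge \eta]$

is closed.


These three well known facts are stated only for reference.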
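The oscillation at a point, computed as in Lemma 6 by shrinking squares, can be estimated by sampling. The sketch below is a rough numerical illustration only: it uses the test function $g(x_1, x_2) = 2x_1x_2/(x_1^2 + x_2^2)$, $g(0,0)=0$ (an assumption of this example), samples each square on a finite grid including its boundary, and therefore yields a lower estimate of $\Omega(g, S_h)$ rather than its exact value. The squares around the origin extend outside $U_2$, which is harmless here because g is defined on the whole plane.

```python
def g(x1, x2):
    """Separately continuous test function, discontinuous at the origin."""
    if x1 == 0.0 and x2 == 0.0:
        return 0.0
    return 2.0 * x1 * x2 / (x1 * x1 + x2 * x2)

def sampled_oscillation(func, center, side, grid=21):
    """Max minus min of func over a grid on the square of the given side around center."""
    c1, c2 = center
    half = side / 2.0
    values = []
    for i in range(grid):
        for j in range(grid):
            x1 = c1 - half + side * i / (grid - 1)
            x2 = c2 - half + side * j / (grid - 1)
            values.append(func(x1, x2))
    return max(values) - min(values)

# Squares of side 1/h shrinking to the origin: the estimate stays near 2.
AT_ORIGIN = [sampled_oscillation(g, (0.0, 0.0), 1.0 / h) for h in (1, 10, 100, 1000)]

# Squares shrinking to (1/2, 1/2): the estimate tends to 0.
AT_REGULAR_POINT = [sampled_oscillation(g, (0.5, 0.5), 1.0 / h) for h in (10, 100, 1000)]
```

In the language of Lemma 7, the first list reflects $\Omega(g, (0,0)) = 2 > 0$, so g is discontinuous at the origin, while the second reflects $\Omega = 0$ at a point of continuity.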
4. Functions of two variables. This section will be devoted to the consid- 
eration of a unicontinuous function f(x:, x2) defined in U2. 


THEOREM 1. Let f(x1, x2) =f(x2) be unicontinuous in Uz. Then for each n>0 
the set 


D, = [xz € Us; Q(f, x2) = 2] 
has a nowhere dense projection on U} and U;. 


Proof. In view of the symmetry of the assumptions it is sufficient to prove 
that D, has a nowhere dense projection on U}. 
Let € be fixed in 0<€<7/4. Then let 


5S;= € Ui; 21) 1/j], j 1, 2, 


where 6,(x:) = 5.(x21) was defined in §2. Then S; is closed, by Lemma 5. 
Also >>.S;= U is of second category by Baire’s theorem. Hence, by Lemma 1, 
there is an integer k and an interval JC U such that JCS;. Thus 


(1) x, € I implies 5,(x:) = 1/k. 
It will now be shown that if 
(m, %) =xmCITXU 


then 
Qf, x2) < 4e. 


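To this end let $\bar x_1 \in I$ and $\bar x_2 \in U$. Then, since $f(x_1, \bar x_2)$ is a continuous function
of $x_1$, there is a $\delta>0$ such that if $x_1 \in (\bar x_1 - \delta, \bar x_1 + \delta)$ then

\[ |f(x_1, \bar x_2) - f(\bar x_1, \bar x_2)| \le \varepsilon. \]

Choose $\delta>0$ so small that $(\bar x_1 - \delta, \bar x_1 + \delta) \subset I$, as may be done since $I$ is open.
Then, by (1), for each $x_1 \in (\bar x_1 - \delta, \bar x_1 + \delta) \subset I$,

\[ |f(x_1, x_2) - f(x_1, \bar x_2)| \le \varepsilon \]

provided

\[ |x_2 - \bar x_2| \le 1/k. \]

Combination of the last two inequalities shows that the function value at any
point $(x_1, x_2)$ in the rectangle

\[ J = (\bar x_1 - \delta, \bar x_1 + \delta) \times (\bar x_2 - 1/k, \bar x_2 + 1/k) \]

differs from the function value at $(\bar x_1, \bar x_2)$ by at most $2\varepsilon$. Thus

\[ \Omega(f, J) \le 4\varepsilon. \]

(I do not pause over the modifications necessary at the boundary of $U_2$.)
Thus, a fortiori,

\[ \Omega(f, (\bar x_1, \bar x_2)) \le 4\varepsilon < \eta. \]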

But $(\bar x_1, \bar x_2)$ was any point of $I \times U$.

It has been shown that $U_2$ contains a strip $I \times U$ which contains no point
of $D_\eta$. But the argument given was equally applicable to any substrip of $U_2$.
This completes the proof of Theorem 1.


THEOREM 2. Let $f(x_1, x_2)$ be unicontinuous in $U_2$. Let $D$ denote the set of
points in $U_2$ where $f(x_1, x_2)$ is discontinuous. Then $D$ is an $F_\sigma$-set and the
projection of $D$ on $U_1^1$ ($U_1^2$) is of first category.

Proof. If $D_{1/n}$ is the set

\[ D_{1/n} = [x_2 \in U_2;\ \Omega(f, x_2) \ge 1/n] \]

then $D_{1/n}$ is closed, by Lemma 8. Also $D_{1/n}$ has a nowhere dense projection
on $U_1^1$ by Theorem 1. Since $\sum D_{1/n}=D$, by Lemma 7, the proof is complete.

The next statement is the one which was called (A) in the introduction. 
It is clearly a restatement of Theorem 2. 


THEOREM 3. Let $f(x_1, x_2)$ be unicontinuous in $U_2$. Then there is a residual
set of lines parallel to each axis consisting entirely of points where $f(x_1, x_2)$ is
continuous.

THEOREM 4. The set of discontinuities of a unicontinuous function in $U_2$ has
dimension at most zero.


Proof. This is immediate from Theorem 3 in view of the fact that a resi- 
dual set is dense. Of course it follows from Lemma 4 also. 

5. Functions of many variables. This section will be devoted to a proof
of the appropriate generalization of the results of the last section for a
unicontinuous function on $U_n$.

THEOREM 5. Let $f(x_1, \dots, x_n) = f(x_n)$ be unicontinuous in $U_n$. Then for
each $\eta>0$ the set

\[ D_\eta = [x_n \in U_n;\ \Omega(f, x_n) \ge \eta] \]

has a nowhere dense projection on $U_{n-1}^i$ for each $i=1, 2, \dots, n$.


Proof. Clearly it is sufficient to prove that the projection of $D_\eta$ on $U_{n-1}^n$ is
nowhere dense.
Let $\varepsilon$ be fixed in $0<\varepsilon<\eta/8$. Then let

\[ S_j = [x_{n-1} \in U_{n-1};\ \delta_\varepsilon(x_{n-1}) \ge 1/j], \qquad j = 1, 2, \dots. \]

Then $S_j$ is linearly closed by Lemma 5. Also $\sum S_j = U_{n-1}$ has property A in
$U_{n-1}$ by Baire's theorem. Hence, by Lemma 2, there is an integer $k$ and an
interval $I_{n-1} \subset U_{n-1}$ such that $I_{n-1} \subset S_k$. Thus

(1) $x_{n-1} \in I_{n-1}$ implies $\delta_\varepsilon(x_{n-1}) \ge 1/k$.


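Now suppose the theorem true for functions unicontinuous in $U_{n-1}$. For
each fixed $x_n \in U$ let $f_{x_n}(x_{n-1})$ denote the function $f(x_{n-1}, x_n)$ of $x_{n-1}$. Let $\{x_{n,h}\}$
be a dense sequence in $U$. Finally let

\[ D_\varepsilon^h = [x_{n-1} \in U_{n-1};\ \Omega(f_{x_{n,h}}, x_{n-1}) \ge \varepsilon]. \]

Then by the induction assumption $D_\varepsilon^h$ has a nowhere dense projection on $U_{n-2}$
and, a fortiori, $D_\varepsilon^h$ is nowhere dense in $U_{n-1}$. Thus, by Baire's theorem,
$\sum_h D_\varepsilon^h$ does not contain $I_{n-1}$.

Thus there is a point $\bar x_{n-1} \in I_{n-1}$ such that $\bar x_{n-1}$ is not in any $D_\varepsilon^h$, that is, so that

(2) $\Omega(f_{x_{n,h}}, \bar x_{n-1}) < \varepsilon$ for every $h$.

It will now be shown that for every $x_n \in U$,

\[ \Omega(f, (\bar x_{n-1}, x_n)) < 8\varepsilon. \]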


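In fact let $\bar x_n \in U$ be fixed. Since $\{x_{n,h}\}$ is a dense sequence there is some $h_1$
such that

\[ |\bar x_n - x_{n,h_1}| \le 1/2k. \]

But, since $\bar x_{n-1} \in I_{n-1}$, this means, by (1),

\[ |\bar x_n - x_{n,h_1}| \le \delta_\varepsilon(\bar x_{n-1}) \]

so that

(3) $|f(\bar x_{n-1}, \bar x_n) - f(\bar x_{n-1}, x_{n,h_1})| \le \varepsilon$

by the definition of $\delta_\varepsilon(x_{n-1})$. Now, by (2) and the definition of $\Omega(f_{x_{n,h_1}}, \bar x_{n-1})$,
there is an open neighborhood $I^*_{n-1}$ of $\bar x_{n-1}$ in $U_{n-1}$ such that

\[ \Omega(f_{x_{n,h_1}}, I^*_{n-1}) < 2\varepsilon. \]

Thus

(4) $x_{n-1} \in I^*_{n-1}$ implies $|f(x_{n-1}, x_{n,h_1}) - f(\bar x_{n-1}, x_{n,h_1})| < 2\varepsilon$.

Now suppose $I^*_{n-1}$ has been chosen so "small" that $I^*_{n-1} \subset I_{n-1}$, which is clearly
permissible. Then

\[ x_{n-1} \in I^*_{n-1} \text{ implies } \delta_\varepsilon(x_{n-1}) \ge 1/k. \]

In other words, for $x_{n-1} \in I^*_{n-1}$,

(5) $|x_n - x_{n,h_1}| \le 1/k$ implies $|f(x_{n-1}, x_n) - f(x_{n-1}, x_{n,h_1})| \le \varepsilon$.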


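Combination of (3), (4), (5) gives

(6) $|f(x_{n-1}, x_n) - f(\bar x_{n-1}, \bar x_n)| < 4\varepsilon$

whenever

(7) $x_{n-1} \in I^*_{n-1}$ and $|x_n - x_{n,h_1}| \le 1/k$,

where $h_1$ was determined so that

\[ |\bar x_n - x_{n,h_1}| \le 1/2k. \]

In particular if $|x_n - \bar x_n| < 1/2k$ the second requirement of (7) is automatic. Thus
for any point $(x_{n-1}, x_n)$ of the neighborhood

\[ I_n = I^*_{n-1} \times (\bar x_n - 1/2k,\ \bar x_n + 1/2k) \]

of $(\bar x_{n-1}, \bar x_n)$ the inequality (6) is valid. Then

\[ \Omega(f, I_n) < 8\varepsilon \]

and, a fortiori,

\[ \Omega(f, \bar x_n) < 8\varepsilon. \]

It has now been shown that the set

(8) $[x_n \in U_n;\ \Omega(f, x_n) < 8\varepsilon]$

contains the line $\bar x_{n-1} \times U$. But (8) is an open set (cf. Lemma 8) in $U_n$ and so
(8) contains a strip $J_{n-1} \times U$ for some interval $J_{n-1} \subset U_{n-1}$.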


The above argument gives the existence of a strip $J_{n-1} \times U$ in the complement
of $D_\eta \subset D_{8\varepsilon}$ (since $8\varepsilon<\eta$). But this argument was equally applicable to
any substrip of $U_n$. This completes the proof of Theorem 5.

Exactly as in the case of two variables the following is an immediate con- 
sequence. 

THEOREM 6. Let $f(x_n)$ be unicontinuous in $U_n$. Let $D$ denote the set of points
in $U_n$ where $f(x_n)$ is discontinuous. Then $D$ is an $F_\sigma$-set and the projection of $D$
on $U_{n-1}^i$ is (for each $i=1, 2, \dots, n$) of first category.


This result, which is the natural extension of Theorem 2, does not obvi- 
ously imply the desired generalization of Theorem 4. However, in view of 
Lemma 2, Theorem 6 is equivalent to the following: 


THEOREM 7. Let $f(x_n)$ be unicontinuous in $U_n$. Let $D$ denote the set of points
in $U_n$ where $f(x_n)$ is discontinuous. Then the projection of $D$ on $U_{n-1}^i$ (for each
$i=1, 2, \dots, n$) does not have property A.


Proof. In view of Lemma 1, an $F_\sigma$-set is of second category if and only if
it contains an interval. In view of Lemma 2 and the fact that a closed set is
linearly closed, an $F_\sigma$-set has property A if and only if it contains an interval.
Thus, for $F_\sigma$-sets, first category is equivalent to the negation of property A.
This shows the equivalence of Theorems 6 and 7.

Of course Theorem 7 may be stated in a positive fashion similar to Theorem 3.
It is this statement which was displayed, for the case $n=3$, in the introduction.
Finally

THEOREM 8. The set of discontinuities of a unicontinuous function in $U_n$
has dimension at most $n-2$.

Proof. This is immediate from Theorem 7 and Lemma 4. 

6. An example. In this section it will be shown that Theorem 6 describes 
the possible sets D completely. This is done by proving the following. 


THEOREM 9. Let $D$ be any $F_\sigma$-set in $U_n$ such that the projection of $D$ on $U_{n-1}^i$
(for each $i=1, 2, \dots, n$) is of first category. Then there exists a unicontinuous
function on $U_n$ for which $D$ is the set of discontinuity points.


The example which proves Theorem 9 will be constructed with the help 
of certain auxiliary functions whose existence is demonstrated first. 


LEMMA 9. Let $D$ be any closed set in $U_n$ such that the projection of $D$ on $U_{n-1}^i$
(for each $i=1, 2, \dots, n$) is nowhere dense. Then there exists a function $f=f_D$
on $U_n$ such that

(a) $0 \le f_D \le 1$;

(b) $f_D$ is unicontinuous on $U_n$;

(c) $\Omega(f_D, x_n) = 0$, if $x_n \in U_n - D$;

(d) $\Omega(f_D, x_n) = 1$, if $x_n \in D$.



Proof. By an oriented closed cube $K$ in $U_n$ will be understood a closed
$n$-cube with faces parallel to the coordinate hyperplanes. With each oriented
cube $K \subset U_n$ let there be associated a function $g_K$ with the following properties:

(1) $g_K$ is defined and continuous on $K$;

(2) $g_K=0$ on the boundary of $K$;

(3) $0 \le g_K \le 1$;

(4) $g_K=1$ at the midpoint of $K$.

For example $g_K(x_n)$ might be chosen proportional to the distance from $x_n$
to the boundary of $K$.
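
The suggested choice can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the cube is described by a hypothetical pair (center, side), which is not notation from the text, and the constant of proportionality is taken as the reciprocal of half the side so that requirement (4) holds.

```python
def g_cube(x, center, side):
    """Distance from x to the boundary of an oriented closed cube K,
    divided by half the side (an assumed normalization giving g = 1 at the midpoint)."""
    half = side / 2.0
    # Largest coordinate deviation from the midpoint; x lies in K iff r <= half.
    r = max(abs(xi - ci) for xi, ci in zip(x, center))
    if r > half:
        raise ValueError("x lies outside the cube K")
    # For a point of an axis-parallel cube, the distance to the boundary
    # is the distance to the nearest face, namely half - r.
    return (half - r) / half


midpoint_value = g_cube((0.5, 0.5), (0.5, 0.5), 0.5)    # requirement (4)
boundary_value = g_cube((0.75, 0.5), (0.5, 0.5), 0.5)   # requirement (2)
```

Requirements (1) and (3) follow from the formula, since the distance to the boundary is continuous and never exceeds half the side.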

Let a line, parallel to the $i$th coordinate axis of $U_n$ ($i=1, 2, \dots, n$) be
called an $i$-grid line if it contains a point of $D$. Let $D_i^*$ be the set of all points
which lie on some $i$-grid line. Then $D_i^*$ is clearly a closed set since $D$ is closed.
Also $D_i^*$ has the same projection on $U_{n-1}^i$ as $D$, hence $D_i^*$ is nowhere dense
in $U_n$.

Now let $D^*=\sum_{i=1}^{n} D_i^*$. Then $D^*$ is a nowhere dense closed set in $U_n$. Let
$C^* = U_n - D^*$, so that $C^*$ is an open set dense in $U_n$. It is well known that any
open set in $U_n$ is the sum of a countable number of nonoverlapping oriented
closed cubes. Thus $C^*=\sum K_j$ where $K_j$ is an oriented closed cube in $U_n$ and
$K_i$ and $K_j$ have at most boundary points in common.

Notice that every point of $D$ is a limit point of midpoints of the $K_j$. In
fact $\sum K_j=C^*$ is dense in $U_n$ and so each point of $D$ is a limit point of points
in some collection of $K_j$. But any finite collection of $K_j$ form a closed set
disjoint from $D^*$ and, a fortiori, disjoint from $D$. Thus each point of $D$ is
a limit point of points from distinct $K_j$. But the diameters of any infinite
collection of $K_j$ obviously must approach zero since $\sum K_j \subset U_n$. Thus each
point of $D$ is a limit point of midpoints of $K_j$.

Now let $\{P_j\}$ be a sequence of points in $D$ such that every point of $D$ is
either a point $P_j$ or a limit point of points $P_j$. The existence of such a sequence
is quite obvious. Let a subsequence $\{K_{m,j}\}$ of the cubes $K_j$ be chosen in such
a way that the midpoint of $K_{m,j}$ is at a distance at most $1/m$ from $P_j$. The
fact that $\{K_{m,j}\}$ exists is clear from the preceding paragraph. It is clear that
by proceeding inductively with respect to $m+j$ the $K_{m,j}$ might be chosen as
all distinct, but this is not essential.

Finally let $f=f_D$ be defined in $U_n$ by

(5)
\[ f_D(x_n) = \begin{cases} g_{K_{m,j}}(x_n) & \text{if } x_n \in K_{m,j},\ m \ge j; \\ 0 & \text{if } x_n \in U_n - \sum_{m \ge j} K_{m,j}. \end{cases} \]

It is to be shown that this function $f_D$ satisfies the requirements (a)-(d) of
Lemma 9.


First of all (a) is obvious from (5) and (3).
In order to prove (d) let $P$ be a point in $D$. Then either $P$ is a point $P_j$ or $P$
is a limit point of such points. In the first case $P=P_j$; then $P$ is a limit point
of the midpoints of the cubes $K_{m,j}$ for $m=j, j+1, \dots$. Thus $P$ is a limit
point of points where $f_D=1$ (in view of (5), (4)). But also $P$ is a limit point of
boundary points of $K_{m,j}$ where $f_D=0$. Thus $\Omega(f_D, P)=1$. In the other case,
that $P$ is the limit of some sequence $\{P_{j_i}\}$, then $P$ is also the limit of the
midpoints of $K_{j_i,j_i}$; and again $\Omega(f_D, P)=1$. This establishes (d).

To prove (c), let $P$ be a point where $\Omega(f_D, P)>0$. Then $P$ is not an interior
point of any cube $K_{m,j}$ by (1). Thus $f_D(P)=0$ (and in fact $P$ is a limit point
of points where $f_D=0$). But, since $\Omega(f_D, P)=\delta>0$, $P$ must also be a limit point
of points where $f_D>\delta/2$ and hence a limit point of points in some subsequence
$\{K_{m_i,j_i}\}$ of $\{K_{m,j}\}$. This subsequence cannot contain only a finite number of
distinct cubes since, if it did, $P$ would clearly be an interior point of some $K_{m,j}$.
Thus $P$ is the limit point of midpoints of an infinite sequence $\{K_{m_i,j_i}\}$ of
distinct cubes. But each such midpoint is at a distance at most $1/m_i$ from $P_{j_i}$.
Thus $P$ is a point $P_j$ or a limit point of such points and, in either case, $P \in D$,
since $D$ is closed.

Finally, to prove (b), the value of $f_D$ along lines parallel to the axes must
be considered. If such a line is a grid line then it is contained entirely in $D^*$
and $f_D=0$ by (5). If such a line is not a grid line it is contained entirely in
$U_n-D$ so that $f_D$ is continuous at every point by (c) and, a fortiori, is
continuous along the given line.

The proof of Lemma 9 is now complete and it is now very easy to prove 
Theorem 9. 

Proof of Theorem 9. Let $D$ be any $F_\sigma$-set in $U_n$ having a first category
projection on $U_{n-1}^i$. Then, by the definition of an $F_\sigma$-set, $D=\sum D_i$, where $D_i$
is closed. But $D_i$ has a nowhere dense projection on each $U_{n-1}^i$. (In fact if a
closed set is dense in some interval it contains the interval and so is not of
first category by Baire's theorem.) Thus each $D_i$ satisfies the requirement of
the set $D$ in Lemma 9. Let $f_{D_i}$ be a function associated with $D_i$ satisfying
(a)-(d) of Lemma 9. Then let

\[ f = \sum_i f_{D_i}/3^i. \]

Then the series defining $f$ is uniformly convergent by (a). Thus $f$ is unicontinuous
since all $f_{D_i}$ are unicontinuous. Also $f$ is continuous on $U_n-D=U_n-\sum D_i$
since each $f_{D_i}$ is continuous on $U_n-D_i \supset U_n-\sum D_i$. Finally, at any point $P$
of $D$, $f$ is discontinuous since the convergence factors $1/3^i$ were chosen
sufficiently rapid that "cancellation" of the discontinuities is impossible in view
of (d).
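
The remark about "cancellation" can be made quantitative. One reading, offered here as an assumption rather than as the author's own computation: if $k$ is the first index with $P \in D_k$, the terms with $i<k$ are continuous at $P$ by (c), the $k$th term has oscillation $3^{-k}$ at $P$ by (d), and by (a) the terms with $i>k$ contribute oscillation at most $\sum_{i>k} 3^{-i} = 3^{-k}/2$, so that $\Omega(f, P) \ge 3^{-k}/2 > 0$. The sketch below checks this arithmetic exactly; the function names are hypothetical.

```python
from fractions import Fraction


def tail_partial_sum(k, terms=60):
    """Partial sum of sum_{i > k} 3**(-i); it increases to 3**(-k) / 2."""
    return sum(Fraction(1, 3 ** i) for i in range(k + 1, k + 1 + terms))


def oscillation_lower_bound(k):
    """Jump 3**(-k) of the k-th term minus the full tail sum 3**(-k) / 2."""
    return Fraction(1, 3 ** k) - Fraction(1, 2 * 3 ** k)


bounds = [oscillation_lower_bound(k) for k in range(1, 6)]
```

A ratio of $1/2$ in place of $1/3$ would make the tail equal to the jump, and this particular estimate would no longer give a positive lower bound.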

7. The consequence of stronger assumptions. This section and the following
one are devoted to an investigation of the consequence of assuming
stronger smoothness than simply continuity parallel to the axes. In order
to state the precise requirement that will be considered in this section, a
definition is needed first.

Let $f(x_n)$ be a unicontinuous function in $U_n$, so that $f(x_{n-1}, x_n)$ is a
uniformly continuous function of $x_n$ in $U$. Let

\[ \omega(x_{n-1}, \delta) = \max_{|x_n' - x_n''| \le \delta} |f(x_{n-1}, x_n') - f(x_{n-1}, x_n'')|. \]

Then $\omega(x_{n-1}, \delta)$ is, for each fixed $x_{n-1}$, a monotone non-decreasing function
of $\delta$ in $0<\delta\le 1$ with $\omega(x_{n-1}, 0+0)=0$.

Condition S: The function $f(x_{n-1}, x_n)$ of $x_n$ is said to satisfy condition S
if there exists a sequence $\{\omega_m(\delta)\}$ of functions $\omega_m(\delta)$ defined for $0<\delta<1$ such
that

\[ \omega_m(0+0) = 0 \]

and, for every $x_{n-1} \in U_{n-1}$ there is an integer $m$ such that

\[ \omega(x_{n-1}, \delta) \le \omega_m(\delta). \]

For example if, for each fixed $x_{n-1}$, $f$ satisfies a Lipschitz condition of
order one, such a sequence is provided by $\omega_m(\delta)=m\cdot\delta$. If it is known only
that for each $x_{n-1}$, $f$ satisfies a Lipschitz condition of some positive order then
$\omega_m(\delta)=m\cdot\delta^{1/m}$ is a sequence of the required type. By considering such
sequences as $\omega_m(\delta)=1/|\log\delta|^{1/m}$ it is seen that condition S is very weak.
However the theorem to be proved next, taken in conjunction with the example of
the preceding section, implicitly shows that condition S is not always satisfied.
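
As a numerical illustration of the second example, the sketch below approximates the modulus of continuity of a function of one variable on $[0, 1]$ over a uniform grid and compares it with $m\cdot\delta^{1/m}$. The test function $\sqrt{x}$ (a Lipschitz condition of order $1/2$), the grid size, and the function names are assumptions chosen for the illustration; they do not come from the text.

```python
import math


def modulus(f, delta, grid=200):
    """Grid approximation of max |f(s) - f(t)| over s, t in [0, 1] with |s - t| <= delta."""
    values = [f(i / grid) for i in range(grid + 1)]
    # Largest index gap whose width gap / grid does not exceed delta.
    steps = int(delta * grid + 1e-9)
    best = 0.0
    for i in range(grid + 1):
        for j in range(i, min(i + steps, grid) + 1):
            best = max(best, abs(values[i] - values[j]))
    return best


def omega_m(m, delta):
    """Comparison function m * delta**(1/m) from the second example."""
    return m * delta ** (1.0 / m)


deltas = (0.05, 0.1, 0.25, 0.5, 1.0)
dominated_by_m2 = all(modulus(math.sqrt, d) <= omega_m(2, d) + 1e-12 for d in deltas)
dominated_by_m1 = all(modulus(math.sqrt, d) <= omega_m(1, d) + 1e-12 for d in deltas)
```

For $\sqrt{x}$ the comparison succeeds with $m=2$ and fails with $m=1$, which is why the sequence has to range over all $m$ when only some positive Lipschitz order is known.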


THEOREM 10. Let $f(x_n)$ be a unicontinuous function on $U_n$ which satisfies
condition S with respect to any one of its variables as $x_n$. Let $D$ denote the set of
points in $U_n$ where $f(x_n)$ is discontinuous. Then $D$ is an $F_\sigma$-set and the projection
of $D$ on $U_{n-1}^i$ is (for each $i=1, 2, \dots, n$) nowhere dense.


Proof. The proof is by induction based on the completely trivial case $n=1$.
Suppose then that the theorem is true for the case $n-1$.

Now let

\[ S_j = [x_{n-1};\ \omega(x_{n-1}, \delta) \le \omega_j(\delta)]. \]

Then $S_j$ is easily seen to be linearly closed. Also $\sum S_j=U_{n-1}$ in view of
condition S. Thus Lemma 2 can be applied to the sequence $\{S_j\}$ and one finds that
there is an integer $k$ and a subinterval $I_{n-1} \subset U_{n-1}$ such that $I_{n-1} \subset S_k$.

Let $\bar x_n \in U$ be fixed. Then $f(x_{n-1}, \bar x_n)$, considered as a function of $x_{n-1}$, has,
by the induction hypothesis, discontinuities at a set of $x_{n-1}$ with a nowhere
dense projection and, a fortiori, nowhere dense in $U_{n-1}$. Thus there is an
interval $J_{n-1} \subset I_{n-1}$ where $f(x_{n-1}, \bar x_n)$ is a continuous function of $x_{n-1}$. Thus if
$\bar x_{n-1} \in J_{n-1}$ then for every $\varepsilon>0$ there is a $\delta_1$ such that

(1) $|\bar x_{n-1} - x_{n-1}| < \delta_1$ implies $x_{n-1} \in J_{n-1}$ and
$|f(\bar x_{n-1}, \bar x_n) - f(x_{n-1}, \bar x_n)| < \varepsilon/2$.


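Now let $\delta>0$ be chosen so that

\[ \omega_k(\delta) \le \varepsilon/2 \]

which is possible since $\omega_k(0+0)=0$. Then, for any $x_{n-1} \in J_{n-1} \subset I_{n-1} \subset S_k$,

\[ \omega(x_{n-1}, \delta) \le \varepsilon/2. \]

Thus, by the definition of $\omega(x_{n-1}, \delta)$,

(2) $|\bar x_n - x_n| < \delta$ implies $|f(x_{n-1}, \bar x_n) - f(x_{n-1}, x_n)| \le \varepsilon/2$.

From (1) and (2), one has

\[ |f(\bar x_{n-1}, \bar x_n) - f(x_{n-1}, x_n)| \le \varepsilon \]

if $|\bar x_{n-1} - x_{n-1}| < \delta_1$ and $|\bar x_n - x_n| < \delta$. Hence $f$ is continuous at $(\bar x_{n-1}, \bar x_n)$. But
$(\bar x_{n-1}, \bar x_n)$ was arbitrary in $J_{n-1} \times U$.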

It has been shown that $U_n$ contains a strip $J_{n-1} \times U$ consisting entirely
of continuity points of $f$. But the argument leading to the existence of this
strip is equally applicable to any substrip of $U_n$. Thus the discontinuity points
of $f$ have a nowhere dense projection on $U_{n-1}^n$. The symmetry of the
assumptions shows that the proof is complete.

8. Another example. In this section it will be shown that the conclusion
of Theorem 10 cannot be strengthened (as far as a restriction on $D$ is
concerned) even if the assumptions are strengthened to the point of requiring
the existence of all derivatives along any line parallel to an axis. The precise
statement of the theorem to be established follows.


THEOREM 11. Let $D$ be any $F_\sigma$-set in $U_n$ such that the projection of $D$ on $U_{n-1}^i$
(for each $i=1, 2, \dots, n$) is nowhere dense. Then there exists a function $f(x_n)$
on $U_n$ such that $f(x_n)$, considered as a function of any one of the variables (for
fixed values of the remaining $n-1$ variables) has all derivatives and such that $D$
is the set of discontinuity points of $f(x_n)$.


Proof. The construction of the example which proves Theorem 11 is simi- 
lar to the construction in the proof of Lemma 9 and will not be given in quite 
so much detail. 

First let $\bar D$ be the closure of $D$. Then $\bar D$ has a nowhere dense projection on
each $U_{n-1}^i$. As in the proof of Lemma 9 let $D_i^*$ be the set of points lying on
some "$i$-grid line," that is, some line parallel to the $x_i$-axis containing a point
of $\bar D$. As before, let $D^*=\sum_{i=1}^{n} D_i^*$. Finally let $C^*=U_n-D^*$, so that $C^*$ is a
dense open set in $U_n$. Now $C^*=\sum K_j$ where $K_j$ is a closed oriented cube and
$K_i$ and $K_j$ have at most boundary points in common.

With any closed oriented cube $K \subset U_n$ let there be associated a function $g_K$
which, in addition to satisfying the requirements (1)-(4) of §6, is infinitely
differentiable along any line parallel to an axis, with all derivatives vanishing
at the boundary of $K$. (Of course, at the boundary of $K$ these derivatives are
all one-sided derivatives calculated with respect to the interior of $K$.) For
example, if $K$ is the unit cube $U_n$ then $g_K$ might be chosen as

\[ g_K(x_1, \dots, x_n) = \exp(8n) \prod_{i=1}^{n} \exp(-1/x_i^2)\cdot\exp(-1/(x_i-1)^2). \]
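
The displayed example is hard to read in print, so the following Python sketch should be taken as an illustration under an assumed reading, namely $g_K = \exp(8n)\prod_i \exp(-1/x_i^2)\exp(-1/(x_i-1)^2)$ on the unit cube, extended by zero on the boundary. The exponents and the factor $\exp(8n)$ are chosen so that requirement (4) of §6 holds; the function names are hypothetical.

```python
import math


def bump_1d(t):
    """exp(-1/t**2 - 1/(t - 1)**2) on (0, 1), taken as 0 at the end points."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return math.exp(-1.0 / t ** 2 - 1.0 / (t - 1.0) ** 2)


def g_unit_cube(x):
    """Product of the one-variable bumps, each rescaled by exp(8) so that
    the value at the midpoint (1/2, ..., 1/2) equals 1."""
    value = 1.0
    for t in x:
        value *= math.exp(8.0) * bump_1d(t)
    return value


midpoint_value = g_unit_cube((0.5, 0.5, 0.5))
boundary_value = g_unit_cube((0.0, 0.5, 0.5))
```

Each factor has exponent $-(1/t^2 + 1/(t-1)^2)$, which is largest at $t=1/2$ where it equals $-8$; this is what keeps the product between 0 and 1 under this reading.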

Now let the $F_\sigma$-set $D$ be written as $D=\sum D_k$ where $D_k$ is closed. For each $k$
let $\{P_{k,j}\}$ be a sequence of points in $D_k$ such that every point of $D_k$ is either
a point $P_{k,j}$ or a limit point of such points. Also let $\{K_{m,k,j}\}$ be a subsequence
of the cubes $K_j$ such that the midpoint of $K_{m,k,j}$ is at a distance at most $1/m$
from $P_{k,j}$. This time it is important to choose the $K_{m,k,j}$ all distinct.

Now a function $f(x_n)$ satisfying the requirements of Theorem 11 can be
defined by

\[ f(x_n) = \begin{cases} (1/k)\, g_{K_{m,k,j}}(x_n) & \text{if } x_n \in K_{m,k,j},\ m \ge \max(k, j); \\ 0 & \text{if } x_n \in U_n - \sum_{m \ge \max(k,j)} K_{m,k,j}. \end{cases} \]


The fact that this function is discontinuous on $D$ and continuous on $U_n-D$
is verified very much as it was shown that the function $f_D$ defined in the proof
of Lemma 9 had $D$ as its set of discontinuities. In the present case $D_k$ turns
out to be the set of points $P$ where $\Omega(f, P) \ge 1/k$. To see that $f$ is infinitely
differentiable along any line parallel to a coordinate axis, notice that such a
line lies either in $D^*$ or in $C^*$. In the first case $f=0$. In the second case the
given line is at a finite distance $\delta>0$ from the closed set $D^*$ and, a fortiori,
at a distance at least $\delta>0$ from any point of $D \subset D^*$. Thus at most a finite
number of the cubes $K_{m,k,j}$, with $m \ge \max(k, j)$, are intersected by the given
line. Thus along the given line $f$ is zero save for a finite number of
nonoverlapping intervals in which it is modified by inserting an infinitely
differentiable piece with all derivatives vanishing at the end points. This completes the
proof.


THE JOHNS HOPKINS UNIVERSITY,
BALTIMORE, MD.

THE GAUSS-BONNET THEOREM FOR 
RIEMANNIAN POLYHEDRA 


BY 
CARL B. ALLENDOERFER AND ANDRE WEIL 


TABLE OF CONTENTS
Section
1. Introduction
2. Dual angles in affine space
3. Dual angles in Euclidean space
4. Convex cells and their tubes
5. Curved cells and their tubes


1. Introduction. The classical Gauss-Bonnet theorem expresses the “cur- 
vatura integra,” that is, the integral of the Gaussian curvature, of a curved 
polygon in terms of the angles of the polygon and of the geodesic curvatures 
of its edges. An important consequence is that the “curvatura integra” of a 
closed surface (or more generally of a closed two-dimensional Riemannian 
manifold) is a topological invariant, namely (except for a constant factor) 
the Euler characteristic. 

One of us(1) and W. Fenchel(2) have independently generalized the latter


result to manifolds of higher dimension which can be imbedded in some 
Euclidean space. For such manifolds, they proved a theorem which we shall 
show to hold without any restriction, and which may be stated as follows: 


THEOREM I. Let $M^n$ be a closed Riemannian manifold of dimension $n$, with
the Euler-Poincaré characteristic $\chi$; let $dv(z)$ be the Riemannian volume-element
at the point with local coordinates $z^\mu$ ($1 \le \mu \le n$); let $g_{\mu\nu}$ be the metric tensor,
$g=|g_{\mu\nu}|$ its determinant, $R_{\mu_1\mu_2\nu_1\nu_2}$ the Riemannian curvature tensor at the same
point; and define the invariant scalar $\Psi(z)$ by:

(1) $\Psi(z) =$ [expression illegible in the source] for $n$ even,
    $\Psi(z) = 0$ for $n$ odd.


Presented to the Society, December 30, 1941 under the title A general proof of the Gauss- 
Bonnet theorem; received by the editors April 23, 1942. 

(1) C. B. Allendoerfer, The Euler number of a Riemann manifold, Amer. J. Math. vol. 62
(1940) p. 243.

(2) W. Fenchel, On total curvatures of Riemannian manifolds. (I), J. London Math. Soc. vol.
15 (1940) p. 15. The concluding words of this paper show that the author contemplated an
extension of his method which was to give him "a formula of Gauss-Bonnet type." We do not know
whether such an extension has been published, or even carried out.


6. Gauss-Bonnet formula for imbedded cells
7. Gauss-Bonnet formula for Riemannian polyhedra


Then:

\[ \chi = \int_{M^n} \Psi(z)\,dv(z). \]


Here and throughout this paper a sign such as $\sum_{\mu,\nu}$ indicates summation
over all indices $\mu_i$, $\nu_i$, these indices running independently over their whole
range; and $\epsilon^{\mu}$ is the relative tensor $\epsilon^{\mu_1\mu_2\cdots\mu_n}$ defined by $\epsilon^{\mu}=+1$ if
$(\mu_1, \mu_2, \dots, \mu_n)$ is an even permutation of $(1, 2, \dots, n)$, $\epsilon^{\mu}=-1$ if it
is an odd permutation, and $\epsilon^{\mu}=0$ otherwise. Owing to the symmetry
properties of the curvature tensor it is readily seen that each term in our sum occurs
$2^n(n/2)!$ times or a multiple of that number; for that reason, in our
arrangement of the numerical factor, the sign $\sum$ is preceded by the inverse of that
integer, so that the sum under $\sum$, together with the factor immediately in
front of it, is (except for $1/g$) a polynomial in the $R$'s with integer coefficients;
similar remarks apply to the other formulae in this paper. On the other hand,
it may be convenient, for geometric reasons, to define the curvature as
$K=\omega_n/2\cdot\Psi(z)$, where $\omega_n$ is the surface-area of the unit-sphere $S^n$ in $R^{n+1}$,
so that the curvature is 1 for that sphere if $n$ is even(3); Theorem I then gives
$\int K\,dv(z)=\omega_n\cdot\chi/2$.

It does not seem to be known at present whether every closed Riemannian
manifold can be imbedded in a Euclidean space. However, the possibility
of local imbedding, at least in the analytic case, has been proved by E. Cartan(4),
and this naturally suggests applying the same method of tubes, which
was developed for closed imbedded manifolds in the above-mentioned paper(1),
to the cells of a sufficiently fine subdivision of an arbitrary manifold.
This gives a theorem on imbedded cells which is the $n$-dimensional analogue
of the Gauss-Bonnet formula; the corresponding theorem for polyhedra will
emerge as the main result of the present paper; except for details which will
be filled in later, this can be stated as follows.

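In a Riemannian manifold $M^n$, let $M^r$ be a differentiable submanifold of
dimension $r<n$; we assume that $M^r$ is regularly imbedded in $M^n$, that is,
taking local coordinates $\zeta^i$ on $M^r$ and $z^\mu$ on $M^n$, that the matrix $\|\partial z^\mu/\partial\zeta^i\|$ is of
rank $r$. We introduce the following tensors. First, we write:

(2) $R_{i_1 i_2 j_1 j_2} = \sum_{\mu,\nu} R_{\mu_1\mu_2\nu_1\nu_2}\, \dfrac{\partial z^{\mu_1}}{\partial\zeta^{i_1}} \dfrac{\partial z^{\mu_2}}{\partial\zeta^{i_2}} \dfrac{\partial z^{\nu_1}}{\partial\zeta^{j_1}} \dfrac{\partial z^{\nu_2}}{\partial\zeta^{j_2}}$,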


(3) It will be noticed that for $n$ even the numerical factor in $1/2\cdot\omega_n\cdot\Psi(z)$ as calculated from
(1) has, owing to the value of $1/2\cdot\omega_n=2^{n/2}\cdot(2\pi)^{n/2}\cdot(n/2)!/n!$, a simple rational value.

(4) E. Cartan, Sur la possibilité de plonger un espace riemannien donné dans un espace
euclidien, Annales de la Société Polonaise de Mathématique vol. 6 (1927) p. 1. This followed
a paper by M. Janet under the same title, ibid. vol. 5 (1926) p. 38, where an incomplete proof
of the same theorem is given; Janet's proof was completed by C. Burstin, Ein Beitrag zum
Problem der Einbettung der Riemannschen Räume in euklidischen Räumen, Rec. Math. (Mat.
Sbornik) N.S. vol. 38 (1931) p. 74.



those being the components of the curvature tensor of the imbedding manifold
$M^n$ in the directions which are tangent to $M^r$. Next, let $x$ be a normal
vector to $M^r$ in $M^n$, with the covariant components $x_\mu$; we write

[display illegible in the source]

where the $\Gamma$'s are the Christoffel symbols in $M^n$. The $\Lambda$'s are linear combinations of the
coefficients of the second fundamental form of $M^r$ in $M^n$. We now introduce,
for $0 \le 2f \le r$, the combinations

[display illegible in the source]

where $\gamma$ is the determinant of the metric tensor $\gamma_{ij}$ on $M^r$. Let now $S^{n-r-1}$
be the unit-sphere in the normal linear manifold to $M^r$ at $\zeta$; calling $\xi$
an arbitrary point on that sphere, that is, an arbitrary unit-vector(5), normal
to $M^r$ at $\zeta$, we denote by $d\xi$ the area-element at $\xi$ on $S^{n-r-1}$; and finally, we
consider the expression(6)

[display illegible in the source]

which can be integrated over the whole or part of the sphere $S^{n-r-1}$.

Let now $P^n$ be a Riemannian polyhedron, that is, a manifold with a
boundary, the boundary consisting of polyhedra $P^r_\lambda$ of lower dimensions
(precise definitions will be given in §7); $z^\mu$ and $\zeta^i$ being local coordinates in $P^n$
and $P^r_\lambda$, respectively, in the neighborhood of a point $\zeta$ of $P^r_\lambda$, we consider the
set $\Gamma(\zeta)$ of all unit-vectors $\xi$ at that point, with components $\xi_\mu$ such that
$\sum_\mu \xi_\mu\cdot dz^\mu/ds \le 0$ when the derivatives $dz^\mu/ds$ are taken along any direction
contained in the angle of $P^n$ at $\zeta$ (for more details, see §§6-7). $\Gamma(\zeta)$ is found
to be a spherical cell, bounded by "great spheres," on the unit-sphere $S^{n-r-1}$
in the normal linear manifold to $P^r_\lambda$ at $\zeta$, and is what we call the "outer angle"
of $P^n$ at $\zeta$.

(5) We consistently (except for a short while in the proof of Lemma 8, §7) make no
distinction between vectors and their end points, and therefore none between unit-vectors and points
on the unit-sphere.

(6) In view of the geometrical nature of the problem, one may suspect that the
numerical coefficients in $\Psi$ are connected with areas of spheres; and bringing out such
connections may point the way to geometrical interpretations of our formulae. For instance, we have:
T(n/2)/(2- 29 - f(r —2f) —2)(m—4) —2f)) wag * (2f) —2f) 127). 




Our main theorem, which includes Theorem I as a particular case,
expresses in terms of the above quantities the inner characteristic $\chi'(P^n)$ of $P^n$,
that is, the Euler-Poincaré characteristic of the open complex consisting of
all inner cells in an arbitrary simplicial or cellular subdivision of $P^n$; our
methods would enable us to give a similar expression for the ordinary
characteristic. The result is as follows:


THEOREM II. $P^n$ being a Riemannian polyhedron, with a boundary consisting
of the polyhedra $P^r_\lambda$, we have:

[display illegible in the source; only the summation $\sum_{r=0}^{n-1}\sum_{\lambda}$ survives]

It will be shown in §6 how the method of tubes, applied to an imbedded 
cell in a Euclidean space, leads directly to the formula in Theorem II for 
such a cell. Sections 2-3 give the necessary details on dual angles and outer 
angles, and contain the proof of the important additivity property for outer 
angles in affine space, which is stated in Theorem III; this may be considered 
as a theorem in spherical geometry, and is a wide generalization of some 
known results on polyhedra in $R^3$; it also includes some results of Poincaré
on the angles of Euclidean and spherical polyhedra. Sections 4-5 are mainly 
devoted to the definition of the tube of a curved cell, and the investigation 
of its topological properties. 

The proof of the main theorem then follows in §7, where it is shown how 
the additivity property for outer angles, proved in §2, implies an additivity 
property for the right-hand side in the formula in Theorem II; hence Theorem 
II is true for a polyhedron $P^n$ if it is true for every polyhedron in a subdivision
of $P^n$. In particular, it is true for an analytic cell because, by Cartan's
theorem, every cell in a sufficiently fine subdivision of such a cell is imbeddable;
by an elementary approximation theorem of H. Whitney, it is therefore true 
for an arbitrary cell. Hence it holds for every polyhedron which can be tri- 
angulated into cells; but it is known that every polyhedron can be so triangu- 
lated, and this completes the proof. Owing to the very unsatisfactory condi- 
tion of our present knowledge of differentiable polyhedra, it has been found 
necessary to include, in §7, the proof of some very general lemmas on the sub- 
divisions of such polyhedra; and the section concludes with some remarks 
about the validity of Theorem II for more general types of polyhedra than 
those we are dealing with.

2. Dual angles in affine space. It has often been observed that the word
"angle" as used in elementary geometry is ambiguous, for it sometimes
refers to a subset of the plane bounded by two rays and sometimes to what
essentially is a 1-chain on the unit-circle. In order to preserve analogies with
elementary geometry, we shall here use the word "angle" both for certain
subsets of an affine vector-space $R^n$ and for certain $(n-1)$-chains in the
manifold of directions from $O$ in $R^n$; this will be done in such a way that no
confusion may arise. Even in affine space we shall adopt the unit-sphere $S^{n-1}$,
that is, the surface $\sum_\mu (x^\mu)^2=1$, as a convenient homeomorphic image of the
manifold of directions from $O$ in $R^n$; in the present section, any other such
image could be used just as well to the same purpose.

In this section, $R^n$ will denote an affine $n$-dimensional space over the field
of real numbers. Assuming that a basis has been chosen in $R^n$ once for all,
we denote by $x^\mu$ ($1 \le \mu \le n$) the components of a vector $x$ in $R^n$ with respect
to that basis. As functions of $x$, the components $x^\mu$ are linear forms in $R^n$;
and they constitute a basis for the vector-space $\overline{R}^n$ of all linear forms
$(y, x)=\sum_\mu y_\mu\cdot x^\mu$ over $R^n$; the $y_\mu$ are then the components, with respect to that
basis, of the form $(y, x)$, or, as we may say for short, of the form $y$. We call
$\overline{R}^n$ the dual space to $R^n$. We shall consider linear manifolds $V^r$ in $R^n$, which,
throughout §§2-3, should be understood to contain $O$; throughout this paper,
the superscript, when used for a space or manifold, should be understood to
indicate the dimension. To every $V^r$ in $R^n$ corresponds in $\overline{R}^n$ the dual
manifold $\overline{V}^{n-r}$, consisting of all linear forms which vanish over $V^r$ (this should not
be confused with the dual space to $V^r$ when the latter is considered as an
affine space).

Convex angles in $R^n$ may be defined in two ways, which may be considered
as dual to each other: (a) a convex angle is the set of points $x$ in $R^n$ which
satisfy a finite number of given inequalities $(b_\nu, x) \ge 0$; (b) a convex angle is
the set of points $x=\sum_\nu u_\nu\cdot a_\nu$, where the $a_\nu$ are a finite number of given points,
and the numbers $u_\nu$ take all values greater than or equal to 0. It is well known
that these two definitions are equivalent. Throughout this paper, all angles
will be convex angles, and we shall often omit the word "convex."

A convex angle $C$ is said to be of dimension $r$ and of type $s$ if $r$ and $s$ are
the dimensions of the smallest linear manifold $V^r$ such that $V^r \supset C$ and of the
largest linear manifold $V^s$ such that $C \supset V^s$; if $r=s$, the angle reduces to $V^r$
and will be called degenerate; otherwise $r>s$. In the notation of angles, the
superscript will usually denote the dimension and a Latin subscript the type
of the angle whenever it is desirable to indicate either or both. A Greek
subscript will be used to distinguish among angles of the same dimension and
type.

Let $C$ be an $r$-dimensional angle, contained in the linear manifold $V^r$; a
point of $C$ will be called an inner point if there is a neighborhood of that point
in $V^r$ which is contained in $C$; such points form a subset of $C$ which is open
with respect to the space $V^r$; if $C$ is defined by the inequalities $(b_\nu, x) \ge 0$, a
point $a$ in $C$ will be an inner point if, and only if, all those of the forms $(b_\nu, x)$
which do not vanish on $V^r$ are greater than 0 at $a$. For $r=0$, $V^r$ and $C$ both
reduce to the point $O$, which is then considered as an inner point of $C$. The
points of an angle which are not inner points constitute a set which is the
union of angles of lower dimension; such points are limits of inner points.


106 C. B. ALLENDOERFER AND ANDRE WEIL [January 


LEMMA 1. Let C be a convex angle of dimension r, with at least one point a
in the open half-space $(b, x) > 0$; then its intersection, D, with the closed half-space
$(b, x) \geq 0$ is a nondegenerate angle of the same dimension.


For all points of C, in a sufficiently small neighborhood of a, will be in D;
among those points there are inner points of C, forming an open set in the $V^r$
which contains C, so that D is r-dimensional. Moreover, D contains a and
not -a, and so cannot be degenerate.

Let C be a convex angle of dimension m; a finite set D of distinct convex
angles $C^r_\lambda$ ($0 \leq r \leq m$; $1 \leq \lambda \leq N_r$) will be called a subdivision of C into convex
angles whenever the two following conditions are fulfilled: (a) every point of
C is an inner point of at least one $C^r_\lambda$ in D; (b) if two angles $C^r_\lambda$, $C^s_\mu$ in D are
such that there is an inner point of $C^r_\lambda$ which is contained in $C^s_\mu$, then $C^r_\lambda \subset C^s_\mu$.
From (b), it follows that no two distinct angles in D can have an inner point
in common. The angles in D can be considered, in the usual way, as forming
a combinatorial complex. A subdivision of an angle C is called degenerate if
it contains a degenerate angle $V^r$ of a dimension r>0; as O then is an inner
point of $V^r$ and is in all the angles of D, it follows that all those angles contain
$V^r$ and are of type at least r, as well as C itself. If D is nondegenerate,
it is easily shown to contain angles of all dimensions less than or equal to m
and greater than or equal to 0, and in particular the angle $C^0$ which is the
point O. An angle $C^r_\lambda$ in D will be called an inner angle if one of its points is
an inner point of C; otherwise we call it a boundary angle. All angles $C^m_\lambda$ of
the highest dimension in D are inner angles.

Let $(b_\iota, x)$ be linear forms in $R^n$, $\iota$ running over a finite set of indices I;
for every partition of I into three parts K, L, M, consider the angle defined
by $(b_\kappa, x) \geq 0$ ($\kappa \in K$), $(b_\lambda, x) \leq 0$ ($\lambda \in L$), $(b_\mu, x) = 0$ ($\mu \in M$); all those angles, or
rather those among them which are different from each other, form a subdivision
of $R^n$. If this process is applied to the set of all linear forms which
are needed to define some given angles $C, C', C'', \ldots$ in finite number, then
the angles of the resulting subdivision which are contained in C form a subdivision
of C; and the same applies to $C', C'', \ldots$.
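
As a small illustration of the construction just described, the following Python sketch lists the angles of the subdivision of the plane determined by three linear forms. It is only a sketch under stated assumptions: the forms x1, x2, x1 - x2 and the helper names (sign_vector, cells_by_sampling) are chosen here for illustration and do not come from the paper, and the sign patterns are found by sampling a small integer grid, which suffices for this example but is not a general algorithm. Each sign pattern (+1, 0 or -1 for every form) labels the set of inner points of one angle.

```python
from itertools import product

def sign(t):
    """Return +1, 0 or -1 according to the sign of t."""
    return (t > 0) - (t < 0)

def sign_vector(forms, x):
    """Signs of the linear forms (b, x) at the point x."""
    return tuple(sign(sum(bi * xi for bi, xi in zip(b, x))) for b in forms)

def cells_by_sampling(forms, radius=3):
    """Collect the distinct sign vectors met on an integer grid.

    Assumption: the grid is fine enough to meet every angle, which holds
    for the small integer example below but not for arbitrary forms.
    """
    n = len(forms[0])
    grid = product(range(-radius, radius + 1), repeat=n)
    return {sign_vector(forms, x) for x in grid}

# Hypothetical example: the forms x1, x2, x1 - x2 in the plane.
forms = [(1, 0), (0, 1), (1, -1)]
cells = cells_by_sampling(forms)

# Angles of the subdivision contained in the quadrant x1 >= 0, x2 >= 0.
in_quadrant = {s for s in cells if s[0] >= 0 and s[1] >= 0}
```

In this example the three lines cut the plane into the origin, six half-lines and six sectors; the angles whose sign pattern is non-negative on the first two forms are those contained in the quadrant.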

The intersection of a convex angle of dimension $r \geq 1$ with the unit-sphere
$S^{n-1}$ in $R^n$, or, as we shall also say, its trace on $S^{n-1}$, will be called a spherical
cell of dimension r-1. If the angle is degenerate, so is the cell. A nondegenerate
cell is homeomorphic to an "element" (a closed simplex) of the same dimension.
A degenerate cell is a sphere.

Let $\Gamma$ be the trace of $C^m$ on $S^{n-1}$; let D be a subdivision of $C^m$. The traces
$\Gamma^{r-1}_\lambda$ of the angles $C^r_\lambda$ of D on $S^{n-1}$ for $1 \leq r \leq m$ form a subdivision of $\Gamma$ into
cells, and so, if D is nondegenerate, into topological elements. We can therefore
apply elementary results in combinatorial topology to the calculation of
the Euler-Poincaré characteristic of such subdivisions.


LEMMA 2. Let D be a nondegenerate subdivision of the angle $C^m$, consisting


1943] THE GAUSS-BONNET THEOREM 107 


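of the angles $C^r_\lambda$ ($0 \leq r \leq m$; $1 \leq \lambda \leq N_r$); let $N'_r$ be the number of inner angles of
dimension r in D; and write:

$$\chi(D) = \sum_{r=0}^{m} (-1)^r N_r, \qquad \chi'(D) = \sum_{r=0}^{m} (-1)^r N'_r.$$

Then, if $C^m$ is nondegenerate, we have $\chi(D) = 0$, $\chi'(D) = (-1)^m$; if $C^m$ is degenerate,
$\chi(D) = \chi'(D) = (-1)^m$.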


This follows at once from the well known value of the characteristic for
elements and for spheres, and from the fact that $N_0 = 1$, $N'_0 = 0$.
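
The alternating sums of Lemma 2 can be checked by hand on small planar examples; the Python sketch below does so mechanically. It is an illustration only, under assumptions made here and not in the paper: the subdivision is the one cut out by the forms x1, x2, x1 - x2, angles are found by sampling an integer grid, and the dimension of an angle is computed as n minus the rank of the forms vanishing on it. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def sign(t):
    return (t > 0) - (t < 0)

def rank(rows):
    """Rank of a list of integer vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in r] for r in rows]
    rk = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            f = rows[i][col] / rows[rk][col]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def count_by_dimension(forms, keep, radius=3):
    """Number of angles of each dimension whose sign pattern satisfies keep."""
    n = len(forms[0])
    seen = set()
    for x in product(range(-radius, radius + 1), repeat=n):
        s = tuple(sign(sum(bi * xi for bi, xi in zip(b, x))) for b in forms)
        if keep(s):
            seen.add(s)
    counts = [0] * (n + 1)
    for s in seen:
        vanishing = [b for b, e in zip(forms, s) if e == 0]
        counts[n - rank(vanishing)] += 1
    return counts

def chi(counts):
    """Alternating sum of the counts, the sign being (-1)^r in dimension r."""
    return sum((-1) ** r * N for r, N in enumerate(counts))

forms = [(1, 0), (0, 1), (1, -1)]
# Whole plane (a degenerate angle, m = 2).
N_plane = count_by_dimension(forms, lambda s: True)
# Quadrant x1 >= 0, x2 >= 0 (nondegenerate, m = 2): all angles, then inner angles only.
N_quad = count_by_dimension(forms, lambda s: s[0] >= 0 and s[1] >= 0)
N_quad_inner = count_by_dimension(forms, lambda s: s[0] > 0 and s[1] > 0)
```

For the quadrant the counts by dimension are 1, 3, 2 for all angles and 0, 1, 2 for the inner ones, so the two alternating sums are 0 and 1; for the whole plane the counts are 1, 6, 6 with alternating sum 1, in agreement with the lemma for m = 2.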

Let now C be a convex angle in $R^n$, defined as the set of all points
$x = \sum_\rho u_\rho \cdot a_\rho$, where the $a_\rho$ are given points and the $u_\rho$ take all values greater
than or equal to 0. A linear form (y, x) will be less than or equal to 0 on C
if, and only if, $(y, -a_\rho) \geq 0$ for all $\rho$; the set of all points y in $\bar{R}^n$ with that
property is therefore a convex angle $\bar{C}$. The relationship between C and $\bar{C}$ is
easily shown to be reciprocal; we shall say that C and $\bar{C}$ are dual to each
other. If two angles C, D are such that $C \supset D$, then their duals $\bar{C}$, $\bar{D}$ are such
that $\bar{C} \subset \bar{D}$. If an angle is degenerate and reduces to the linear manifold $V^r$,
then its dual is the dual manifold $\bar{V}^{n-r}$. It follows that if $V^r \supset C \supset V^s$, then
$\bar{V}^{n-r} \subset \bar{C} \subset \bar{V}^{n-s}$; if, therefore, C is of dimension r and type s, its dual $\bar{C}$ is of
dimension n-s and type n-r.
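
The rule exchanging dimension and type under duality can be seen on a very small case. The Python sketch below is an illustration only; the example angle (the closed upper half-plane, given by three generators) and the function name in_dual are assumptions made here, not taken from the paper. The half-plane has dimension 2 and type 1 in the plane, so its dual should have dimension 1 and type 0.

```python
def in_dual(y, generators):
    """True when the form (y, x) is <= 0 at every generator, hence on the whole angle."""
    return all(sum(yi * ai for yi, ai in zip(y, a)) <= 0 for a in generators)

# Hypothetical example: the closed upper half-plane x2 >= 0 in the plane,
# generated by (1, 0), (-1, 0) and (0, 1). It contains the x1-axis, so it
# has dimension r = 2 and type s = 1.
half_plane = [(1, 0), (-1, 0), (0, 1)]

# The dual is expected to be the half-line y1 = 0, y2 <= 0
# (dimension n - s = 1, type n - r = 0).
on_ray = in_dual((0, -1), half_plane)
off_ray = in_dual((1, -5), half_plane)
opposite = in_dual((0, 1), half_plane)
```

Only the first of the three test points lies in the dual: the second fails the condition on the generator (1, 0), the third on the generator (0, 1).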


Lemma 3. Let $\bar{C}$ be the dual of an angle C of type s, and $C \supset V^s$. A point b
of $\bar{C}$ is an inner point of $\bar{C}$ if and only if the form (b, x) is less than 0 at all
points of C other than those of $V^s$.


Let C, as above, be the set of points $x = \sum_\rho u_\rho \cdot a_\rho$ when the u's take all
values greater than or equal to 0. Then $\bar{C}$ is defined by the inequalities
$(y, -a_\rho) \geq 0$, is of dimension n-s, and is contained in the dual $\bar{V}^{n-s}$ to $V^s$.
We have seen that b is an inner point of $\bar{C}$ if and only if $(b, -a_\rho) > 0$ for all
those values of $\rho$ for which $(y, -a_\rho)$ does not vanish on $\bar{V}^{n-s}$, that is, for which
$a_\rho$ does not lie in $V^s$; this obviously implies the truth of our lemma.

We now introduce the unit-sphere $\bar{S}^{n-1}$ in $\bar{R}^n$ (to which our earlier remarks
about spheres apply); and we shall use the subdivisions of $\bar{S}^{n-1}$, induced
by the subdivisions of $\bar{R}^n$ into convex angles, in order to define chains
on $\bar{S}^{n-1}$ in the sense of combinatorial topology. All chains should be understood
to be (n-1)-chains on $\bar{S}^{n-1}$ built up from such subdivisions, the ring of
coefficients being the ring of rational integers. We make the usual identifications
between certain chains belonging to different subdivisions, by the following
rule: if D' is a refinement of D, and a cell $\Gamma^{n-1}$ of D is the union of cells
$\Delta^{n-1}_\lambda$ of D', we put $\Gamma^{n-1} = \sum_\lambda \Delta^{n-1}_\lambda$. With that convention, any n-dimensional
angle C defines a chain, namely, the cell $\Gamma = C \cap \bar{S}^{n-1}$, taken with coefficient
+1 in a suitable subdivision. An angle of dimension less than n is considered
as defining the chain 0. Angles being given in $\bar{R}^n$ in finite number, there are
always subdivisions of $\bar{S}^{n-1}$ in which the traces of all those angles appear as
chains: we get such a subdivision by making use of all the linear forms which
appear in the definition of our angles, as previously explained.

Let C be any convex angle in $R^n$, and $\bar{C}$ its dual; the chain defined by $\bar{C}$
on $\bar{S}^{n-1}$ will be called the outer angle belonging to C, and will be denoted by
$\Omega(C)$; that is the chain consisting of the cell $\bar{C} \cap \bar{S}^{n-1}$ if $\bar{C}$ is of dimension n,
that is if C is of type 0; if C is of type greater than 0, $\bar{C}$ is of dimension less
than n, and $\Omega(C) = 0$. With that definition, we have the following theorem:


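THEOREM III. In a subdivision D of a convex angle C of dimension m, let $C^r_\lambda$
($0 \leq r \leq m$; $1 \leq \lambda \leq N'_r$) be the inner angles; let $\Omega(C)$ and $\Omega(C^r_\lambda)$ be the outer angles
belonging to C and to $C^r_\lambda$, respectively. Then:

$$\sum_{r=0}^{m} \sum_{\lambda=1}^{N'_r} (-1)^r \, \Omega(C^r_\lambda) = (-1)^m \, \Omega(C).$$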


We may assume that D is nondegenerate, as otherwise C and all $C^r_\lambda$ are
of type greater than 0 and $\Omega(C) = \Omega(C^r_\lambda) = 0$. Let $\Gamma$ be any (n-1)-cell in a
subdivision of $\bar{S}^{n-1}$ in which $\Omega(C)$ and all $\Omega(C^r_\lambda)$ are sums of cells; put $\epsilon = 1$ or 0
according as $\Gamma$ is contained in $\Omega(C)$ or not, and $\epsilon_{r,\lambda} = 1$ or 0 according as $\Gamma$ is
contained in $\Omega(C^r_\lambda)$ or not. We have to prove that $\sum_{r,\lambda} (-1)^r \epsilon_{r,\lambda} = (-1)^m \epsilon$.

Take first the case $\epsilon = 1$. Then $\Gamma$ is contained in the dual $\bar{C}$ of C, and therefore
in the duals of all $C^r_\lambda$, which all contain $\bar{C}$; all the $\epsilon_{r,\lambda}$ are equal to 1, and
our formula reduces to $\sum_r (-1)^r N'_r = (-1)^m$, which is contained in Lemma 2.

Take now the case $\epsilon = 0$. Let b be an inner point of $\Gamma$; call E the angle, or
closed half-space, determined by $(b, x) \geq 0$ in $R^n$; call J the subset of E defined
by $(b, x) > 0$. As b is not in $\bar{C}$, C has a point in J, and therefore (by Lemma 1)
$D = C \cap E$ is an angle of dimension m. Similarly, $C^r_\lambda$ has a point in J if, and
only if, $\epsilon_{r,\lambda} = 0$, and then $D^r_\lambda = C^r_\lambda \cap E$ is a nondegenerate angle of dimension r.
Every inner point of D is an inner point of C, therefore an inner point of a $C^r_\lambda$;
it must be, then, an inner point of the corresponding $D^r_\lambda$, which shows that
those $D^r_\lambda$ which correspond to values of r, $\lambda$ such that $\epsilon_{r,\lambda} = 0$ are the inner
angles of a subdivision of D; if $M'_r$ is the number of such $D^r_\lambda$ for a given
dimension r, we have therefore, by Lemma 2, $\sum_r (-1)^r M'_r = (-1)^m$; hence,
in that case, $\sum_{r,\lambda} (-1)^r \epsilon_{r,\lambda} = \sum_r (-1)^r (N'_r - M'_r) = 0$, which completes the
proof.
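
Theorem III can be checked by hand in the plane, and the Python sketch below does this for one example. It is an illustration under assumptions made here, not in the paper: C is the quadrant generated by (1, 0) and (0, 1), subdivided by the half-line through (1, 1), so that the inner angles are that half-line and the two sectors; the plane is identified with its dual by the usual scalar product; and a chain on the unit circle is represented by its coefficient at sampled directions, chosen to avoid the boundary directions of the dual angles. The names in_dual and both_sides are hypothetical.

```python
import math

def in_dual(y, generators, tol=1e-12):
    """y lies in the dual angle when (y, a) <= 0 for every generator a."""
    return all(y[0] * a[0] + y[1] * a[1] <= tol for a in generators)

# C: the first quadrant (m = 2), subdivided by the half-line through (1, 1).
C = [(1, 0), (0, 1)]
inner_angles = [            # (dimension r, generators of the angle)
    (1, [(1, 1)]),
    (2, [(1, 0), (1, 1)]),
    (2, [(0, 1), (1, 1)]),
]
m = 2

def both_sides(theta):
    """Coefficients at direction theta of the two sides of the formula of Theorem III."""
    y = (math.cos(theta), math.sin(theta))
    lhs = sum((-1) ** r * in_dual(y, gens) for r, gens in inner_angles)
    rhs = (-1) ** m * in_dual(y, C)
    return lhs, rhs

# Sample directions 7, 27, ..., 347 degrees; none is a multiple of 45 degrees,
# so none lies on the boundary of one of the dual angles.
samples = [math.radians(7 + 20 * k) for k in range(18)]
results = [both_sides(theta) for theta in samples]
```

At every sampled direction the two coefficients agree: they are both 1 for directions in the third quadrant (the outer angle of C) and both 0 elsewhere.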

Theorem III applies to angles of any dimension and type, and in particular
to degenerate angles. Whenever C is of type greater than 0, $\Omega(C)$ is 0.

We observe here that it is merely in order to simplify our exposition that
we do not deal with re-entrant, that is, non-convex angles; all our results
apply automatically to such angles, provided Theorem III is used to define
the corresponding outer angles; we mean that, D being a subdivision of a non-convex
angle C into convex angles, $\Omega(C)$ should be defined by the formula in
Theorem III; Theorem III may then be used to show that this $\Omega(C)$ does not
depend upon the choice of D. Even self-overlapping angles could be treated
in the same way.

3. Dual angles in Euclidean space. In view of the use to be made of dual
angles in §§5-7, we add some remarks on the few circumstances which are
peculiar to the case of Euclidean spaces. We therefore assume that a positive-definite
quadratic form $\sum_{\mu,\nu} g_{\mu\nu} \cdot x^\mu x^\nu$, with constant coefficients $g_{\mu\nu}$, is given in
the space $R^n$ of §2. As usual, this is used primarily in order to identify $R^n$
with the dual space $\bar{R}^n$ by means of the formulae $y_\mu = \sum_\nu g_{\mu\nu} \cdot x^\nu$, or, calling
$\|g^{\mu\nu}\|$ the inverse matrix to $\|g_{\mu\nu}\|$, $x^\mu = \sum_\nu g^{\mu\nu} \cdot y_\nu$; the two spaces being thus
identified, $x^\mu$ and $y_\mu$ are called the contravariant and the covariant components,
respectively, of the vector which they define; they are the same
when, and only when, cartesian coordinates are chosen in $R^n$. We have
$(x, x') = \sum_{\mu,\nu} g_{\mu\nu} \cdot x^\mu x'^\nu$; two vectors are called orthogonal if $(x, x') = 0$. The unit-sphere
$S^{n-1} = \bar{S}^{n-1}$ in $R^n$ is then naturally taken to be the set of all unit-vectors
defined by $(x, x) = 1$; only in cartesian coordinates does it appear as
$\sum_\mu (x^\mu)^2 = 1$. The dual manifold $\bar{V}^{n-r}$ to a given linear manifold $V^r$ is now the
orthogonal or normal manifold to $V^r$, consisting of all vectors which are orthogonal
to every vector in $V^r$.

Every linear manifold $V^r$ may now itself be regarded as a Euclidean space,
and identified with the dual space; if C is an angle in $V^r$, we may therefore
consider its dual taken within $V^r$, which will be an angle in $V^r$, as well as its
dual in $R^n$. When applied to an angle of given dimension and type, this leads
to the following results, which we state in the notation best suited to later
applications.

Let $R^N$ be a Euclidean space; let $A_r$ be an angle of dimension n and type r
in $R^N$, contained in the linear manifold $T^n$ and containing the linear manifold
$T^r$; put q = N-n, call $N^q$ the orthogonal manifold to $T^n$, and $N^{n-r}$ the
orthogonal manifold to $T^r$ within $T^n$: the orthogonal manifold to $T^r$ in $R^N$
is then the direct sum $N^{n-r} + N^q$, consisting of all sums of a vector in $N^{n-r}$
and a vector in $N^q$.

If we take cartesian coordinates $w^\alpha$ ($1 \leq \alpha \leq N$) so that the r first basis-vectors
are in $T^r$, the n-r next ones in $N^{n-r}$, and the q last ones in $N^q$, the
angle $A_r$ can be defined by $w^{n+\rho} = 0$ ($1 \leq \rho \leq q$) and by a finite number of
inequalities of the form $\sum_{\sigma=1}^{n-r} b_\sigma \cdot w^{r+\sigma} \geq 0$. It is then readily seen that the dual
$A^{N-r}$ of $A_r$ in $R^N$, and its dual $A^{n-r}$ taken within $T^n$, are related by the formula:
$A^{N-r} = A^{n-r} + N^q$, which means that $A^{N-r}$ consists of all sums of a vector
in $A^{n-r}$ and a vector in $N^q$; in other words, a vector is in $A^{N-r}$ if and only
if its orthogonal projection on $T^n$ belongs to $A^{n-r}$. Moreover, $A^{n-r}$ is the same
as the dual, taken within $N^{n-r}$, of the trace of $A_r$ on $N^{n-r}$, that trace being an
angle of dimension n-r and of type 0. In this way, questions concerning the
dual of an angle of arbitrary dimension and type may be reduced to similar
questions concerning the dual of an angle of type 0 and of the highest dimension
in a suitable space. The same, of course, could be done in an affine space
if desired.



4. Convex cells and their tubes(7). We consider an affine space $R^N$, and
its dual $\bar{R}^N$. The linear manifolds which we shall now introduce do not necessarily
contain O.

A convex cell in $R^N$ is a compact set of points defined by a finite number
of inequalities $(b_\nu, z) \geq d_\nu$. It is said to be of dimension n if n is the dimension
of the smallest linear manifold $W^n$ containing it; it is then known to be
homeomorphic to an n-dimensional element. $K^n$ being an n-dimensional cell,
contained in the linear manifold $W^n$, an inner point of $K^n$ is a point, a neighborhood
of which in $W^n$ is contained in $K^n$. Inner points of $K^n$ form an open
set in $W^n$; the closure of that set is $K^n$, and its complement in $K^n$, that is, the
boundary of $K^n$, consists of a finite number of convex cells $K^r_\lambda$, where r takes
all values greater than or equal to 0 and less than or equal to n-1. We shall
count $K^n$ as one of the $K^r_\lambda$; with that convention, the $K^r_\lambda$, for $0 \leq r \leq n$, form a
combinatorial complex of dimension n. $K^r_\lambda$ is a convex cell in a linear manifold
$W^r_\lambda$; the inner points of $K^r_\lambda$ are those which belong to no $K^s_\mu$, for s<r. Every
point in $K^n$ is an inner point of one $K^r_\lambda$ and one only; and, if an inner point
of $K^r_\lambda$ belongs to $K^s_\mu$, then $K^r_\lambda \subset K^s_\mu$.

z being a point in $K^n$, the points $x = \xi \cdot (z' - z)$, where z' describes $K^n$
and $\xi$ takes all values greater than or equal to 0, form a convex angle, which
can be defined by some of the inequalities $(b_\nu, x) \geq 0$; this will be called the
angle of $K^n$ at z; conversely, if x is any point in that angle, $z + \epsilon \cdot x$ will be in
$K^n$ for all sufficiently small $\epsilon \geq 0$. The angle of $K^n$ at z is of dimension n, and
contained in the linear manifold $V^n$, the parallel manifold to $W^n$ through O;
if z is an inner point of $K^r_\lambda$, the angle of $K^n$ at z is of type r and contains $V^r_\lambda$,
the parallel manifold to $W^r_\lambda$ through O; it depends only upon r and $\lambda$, and will
be denoted by $C_{r,\lambda}$; its dual $\bar{C}^{N-r}_\lambda$ is of dimension N-r and type N-n.


Lemma 4. Let v be a vector in $\bar{R}^N$; v is in $\bar{C}^{N-r}_\lambda$ if, and only if, there is a real
number e such that $(v, z) = e$ on $K^r_\lambda$ and $(v, z) \leq e$ on $K^n$; v is an inner point of
$\bar{C}^{N-r}_\lambda$ if, and only if, there is an e such that $(v, z) = e$ on $K^r_\lambda$ and $(v, z) < e$ for all z
in $K^n$ except those in $K^r_\lambda$.


As to the first point, let v be in $\bar{C}^{N-r}_\lambda$: let $z_0$ be in $K^r_\lambda$; put $e_0 = (v, z_0)$. For
every z in $K^n$, $z - z_0$ is in $C_{r,\lambda}$, therefore $(v, z - z_0) \leq 0$, hence $(v, z) \leq e_0$; therefore
$e_0$ is the least upper bound of (v, z) on $K^n$ and cannot depend upon the
choice of $z_0$ in $K^r_\lambda$, so that $(v, z) = e_0$ for all z in $K^r_\lambda$; this proves the first point.
Conversely, suppose that $(v, z) = e$ for one z in $K^r_\lambda$, and that $(v, z') \leq e$ for all z'


(7) Tubes of convex bodies and of surfaces are of course nothing new, being closely related
to the familiar topic of parallel curves and surfaces. On some aspects of this topic which belong
to elementary geometry, the reader may consult W. Blaschke, Vorlesungen über Integralgeometrie.
II, Hamburger Mathematische Einzelschrift, no. 22, Teubner, Leipzig and Berlin,
1937, in particular §37; on p. 93 of that booklet, he will find careful drawings of the tube of a
triangle in the plane, and of a tetrahedron in 3-space. The volume of the tube of a closed manifold
was recently calculated by H. Weyl, On the volume of tubes, Amer. J. Math. vol. 61 (1939)
p. 461; part of H. Weyl's calculations will be used in our §6.


in $K^n$; we have $(v, z' - z) \leq 0$ for all z' in $K^n$; this gives $(v, x) \leq 0$ for all x in
$C_{r,\lambda}$, and so v is in $\bar{C}^{N-r}_\lambda$. The second part can now easily be deduced from
Lemma 3.

$K^n$ being compact, every linear form (v, z) has on $K^n$ a least upper bound e;
the intersection of $K^n$ with the linear manifold $(v, z) = e$ is then one of the cells
$K^r_\lambda$. This fact, combined with Lemma 4, shows that the angles $\bar{C}^{N-r}_\lambda$ constitute
a subdivision of $\bar{R}^N$, according to our definition in §2. The angle $C_n$ of $K^n$
at every inner point is degenerate, and reduces to $V^n$; its dual $\bar{C}^{N-n}$ is therefore
the dual manifold $\bar{V}^{N-n}$ to $V^n$; the subdivision of $\bar{R}^N$ which consists of
the $\bar{C}^{N-r}_\lambda$ is therefore nondegenerate if N=n, and degenerate if N>n. We leave
it as an exercise to the reader to verify that, conversely, every subdivision
of $\bar{R}^N$ into convex angles can be thus derived from a convex cell, or rather
from a class of convex cells, in $R^N$. We observe incidentally that Theorem III
of §2 could now be applied; taking N=n, which is the only significant case,
the $\Omega(\bar{C}^{N-r}_\lambda)$ are now the spherical cells determined by the $C_{r,\lambda}$ on the unit-sphere
$S^{n-1}$. In particular, assuming that we are in a Euclidean space, and
calling $\mu(C_{r,\lambda})$ the spherical measure of the cell determined by $C_{r,\lambda}$ (which
is nothing else than the measure of the "solid angle" $C_{r,\lambda}$), we find that
$\sum_r \sum_\lambda (-1)^r \cdot \mu(C_{r,\lambda}) = 0$; this is the main result on Euclidean polyhedra in
H. Poincaré's paper(8) on polyhedra in spaces of constant curvature; his results
on spherical polyhedra could also be derived by similar methods.
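
For a convex polygon in the plane (N = n = 2) the alternating sum of solid angles just mentioned is easy to evaluate directly. The Python sketch below is an illustration only, with polygons and function names chosen here and not taken from the paper: the angle at a vertex is the interior angle, the angle at an inner point of an edge is a half-plane of measure pi, and the angle at an inner point of the polygon is the whole plane of measure 2 pi.

```python
import math

def interior_angles(polygon):
    """Interior angles of a convex polygon given by its vertices in counterclockwise order."""
    k = len(polygon)
    angles = []
    for i in range(k):
        px, py = polygon[i - 1]
        cx, cy = polygon[i]
        nx, ny = polygon[(i + 1) % k]
        ux, uy = px - cx, py - cy      # towards the previous vertex
        vx, vy = nx - cx, ny - cy      # towards the next vertex
        angles.append(math.atan2(abs(ux * vy - uy * vx), ux * vx + uy * vy))
    return angles

def alternating_angle_sum(polygon):
    """Sum over all cells of (-1)^r times the measure of the angle at that cell."""
    k = len(polygon)
    vertex_part = sum(interior_angles(polygon))   # r = 0: angles at the k vertices
    edge_part = k * math.pi                       # r = 1: a half-plane at each of the k edges
    face_part = 2 * math.pi                       # r = 2: the whole plane
    return vertex_part - edge_part + face_part

# Hypothetical examples, both convex and listed counterclockwise.
triangle = [(0, 0), (4, 0), (1, 3)]
pentagon = [(0, 0), (2, 0), (3, 2), (1, 4), (-1, 2)]
sum_triangle = alternating_angle_sum(triangle)
sum_pentagon = alternating_angle_sum(pentagon)
```

In the plane the vanishing of this sum is the familiar statement that the interior angles of a convex polygon with k sides add up to (k - 2) pi.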

Now we take $R^N$ as a Euclidean space, distance and scalar product being
defined by means of a fundamental quadratic form (y, y); and we consequently
identify $\bar{R}^N$ with $R^N$, as we did in §3. Let y be any point in $R^N$; its
set-theoretical distance $\delta(y)$ to $K^n$ is a continuous function of y. Let $z = z(y)$
be the nearest point to y in $K^n$; as $K^n$ is a compact convex set, z(y) is uniquely
defined and depends continuously upon y; the vector $v = y - z(y)$, which is of
length $\delta(y)$, therefore also depends continuously upon y. That being so, we
have $(y - z', y - z') \geq (v, v)$ for every z' in $K^n$. Let x be a vector in the angle of
$K^n$ at z; $z' = z + \epsilon \cdot x$ is in $K^n$ for sufficiently small $\epsilon \geq 0$, and then $y - z' = v - \epsilon \cdot x$,
so that, for small $\epsilon$, we have $(v - \epsilon \cdot x, v - \epsilon \cdot x) \geq (v, v)$. That implies that
$(v, x) \leq 0$. If, therefore, z is an inner point of $K^r_\lambda$, so that the angle at z is $C_{r,\lambda}$,
v is in $\bar{C}^{N-r}_\lambda$. Conversely, let v be in $\bar{C}^{N-r}_\lambda$, and z be an inner point of $K^r_\lambda$; as
$z' - z$ is in $C_{r,\lambda}$ for every z' in $K^n$, the same calculation will show that z is the
point in $K^n$ nearest to $z + v$.
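
The relation between the nearest point and the dual angle can be tried out on the unit square in the plane. The Python sketch below is an illustration under assumptions made here: the cell is the square with vertices (0,0), (1,0), (1,1), (0,1), the scalar product is the usual one, the nearest point is obtained by clamping each coordinate, and the function names are hypothetical. Since the form (v, z' - z) is linear in z', it is enough to test the inequality at the vertices of the square.

```python
def nearest_point_in_box(y, lo=0.0, hi=1.0):
    """Nearest point to y in the cube [lo, hi]^N, obtained by clamping each coordinate."""
    return tuple(min(max(t, lo), hi) for t in y)

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

def check_dual_condition(y, vertices):
    """Return z(y), v = y - z(y), and whether (v, z' - z) <= 0 at every vertex z'."""
    z = nearest_point_in_box(y)
    v = tuple(a - b for a, b in zip(y, z))
    ok = all(dot(v, tuple(p - q for p, q in zip(zp, z))) <= 1e-12 for zp in vertices)
    return z, v, ok

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
z1, v1, ok1 = check_dual_condition((2.0, 3.0), square)    # nearest point is a vertex
z2, v2, ok2 = check_dual_condition((0.5, -1.0), square)   # nearest point lies on an edge
```

In the first case v = (1, 2) lies inside the quarter-plane of outward directions at the vertex (1, 1); in the second, v = (0, -1) is the outward normal to the bottom edge.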

We now consider the set $\Theta^N$ of all points y in $R^N$ whose distance $\delta(y)$ to
$K^n$ is at most 1, and we call it the Euclidean tube of $K^n$ in $R^N$. As $\Theta^N$ is a
compact convex set and contains an open set in $R^N$, it is homeomorphic to
an N-dimensional closed element. On the other hand, let $B^N$ be the set of all
vectors v in $R^N$ such that $(v, v) \leq 1$, the boundary of which is the unit-sphere
$S^{N-1}$; let $T(K^n)$ be the subset of the direct product $K^n \times B^N$, consisting of all

(8) H. Poincaré, Sur la généralisation d'un théorème élémentaire de géométrie, C. R. Acad.
Sci. Paris vol. 140 (1905) p. 113.


elements (z, v) of that product such that, if z is an inner point of $K^r_\lambda$, v is in
$\bar{C}^{N-r}_\lambda$. We have shown that the relation $y = z + v$ defines a one-to-one bicontinuous
correspondence between $\Theta^N$ and $T(K^n)$; the latter, therefore, is a closed
subset of $K^n \times B^N$, homeomorphic to $B^N$; by means of the correspondence
defined by $y = z + v$, we identify once for all $\Theta^N$ and $T(K^n)$. Calling (z(y), v(y))
the point in $T(K^n)$ which is thus identified with y in $\Theta^N$, we see that the
boundary of $\Theta^N$ consists of all points y for which v(y) is on $S^{N-1}$; in other
words, the mapping $y \to v(y)$ of the tube into $B^N$ maps the boundary into the
boundary. As every v is in at least one $\bar{C}^{N-r}_\lambda$, the image of the tube by the mapping
v(y) covers the whole of $B^N$. If we consider a vertex $z_0 = K^0_\lambda$ of $K^n$, and
take for $v_0$ an inner point of the angle $\bar{C}^N_\lambda$, all vectors v sufficiently near to $v_0$
in $R^N$ belong to $\bar{C}^N_\lambda$ and to no other angle $\bar{C}^{N-r}_\mu$, as $\bar{C}^N_\lambda$ is an angle of the highest
dimension in the subdivision of $R^N$ which consists of the $\bar{C}^{N-r}_\lambda$. Every such
vector v, therefore, is the image, by v(y), of the point $y = z_0 + v$ and of no other
point of $\Theta^N$. This shows that in the neighborhood of such a $v_0$ the mapping
v(y) has the local degree +1, and so, as it maps boundary into boundary, it
has the degree +1 everywhere, provided of course that both $\Theta^N$ and $B^N$ are
given the orientation induced by that of $R^N$.

5. Curved cells and their tubes. From now onwards, $K^n$ will be a convex
cell in an affine space $R^n$; the object of §§5-6 will be to discuss differential-geometric
properties of $K^n$ corresponding to the Riemannian structure determined
on it by a certain choice of a $ds^2$. We write the coordinates in $K^n$ as
$z^\mu$ ($1 \leq \mu \leq n$); and we choose coordinates $\zeta^i$ ($1 \leq i \leq r$) on each one of the cells
$K^r_\lambda$; for instance, we may choose the $\zeta^i$ from among the $z^\mu$,
taking care to select such as are independent on $K^r_\lambda$, and this may be understood
for definiteness, although playing no part in the sequel. In what follows,
N = n+q is any integer greater than or equal to n; and we make for §§5-6 the
following conventions about the ranges of the various letters which will occur
as indices:

$$1 \leq \alpha \leq N; \quad 1 \leq i, j \leq r; \quad 1 \leq \rho \leq q; \quad \ldots$$

We shall consider real-valued functions $\phi(z)$, defined on $K^n$. As usual,
such a function is said to be of class $C^1$ (on $K^n$) if it has a differential
$d\phi = \sum_\mu \phi_\mu(z) \cdot dz^\mu$ with coefficients $\phi_\mu(z) = \partial\phi/\partial z^\mu$ which are continuous functions
over $K^n$; class $C^m$ is defined inductively, $\phi$ being of class $C^m$ if it is of
class $C^1$ and the $\partial\phi/\partial z^\mu$ are of class $C^{m-1}$.

Local properties of $K^n$ as a differentiable space are those which remain
invariant under a differentiable change of local coordinates with jacobian
different from 0. Such properties include the intrinsic definition of the tangent
affine space $T^n(z)$ and of the angle of $K^n$ at the point z as follows. $T^n(z)$ is the
vector-space consisting of all differentiations $X\phi$, defined over the set of all
functions $\phi$ of class $C^1$ in a neighborhood of z, which can be expressed as
$X\phi = \lim \xi \cdot [\phi(z'') - \phi(z')]$, where z' and z'' both tend to z within $K^n$, and $\xi$
tends to $+\infty$. The vectors $X_\mu \phi = \partial\phi/\partial z^\mu$ form a basis for $T^n(z)$, so that every
point of $T^n(z)$ can be written as $X\phi = \sum_\mu x^\mu \cdot \partial\phi/\partial z^\mu$; we shall denote by x the
point of $T^n(z)$ which, for that basis, has the components $x^\mu$. As in §2, the dual
space $\bar{T}^n(z)$ to $T^n(z)$ is the space of the linear forms $(y, x) = \sum_\mu y_\mu \cdot x^\mu$; the
elements x of $T^n(z)$ and y of $\bar{T}^n(z)$ are known in tensor-calculus as contravariant
and covariant vectors, respectively.

The angle of $K^n$ at z is the subset of $T^n(z)$, consisting of all those differentiations
$X\phi$ which can be expressed as $X\phi = \lim \xi \cdot [\phi(z') - \phi(z)]$, where z'
tends to z within $K^n$ and $\xi$ tends to $+\infty$; by the correspondence which maps
every point $x = (x^\mu)$ in $T^n(z)$ onto the point with coordinates $x^\mu$ in the affine
space $R^n$ containing $K^n$, that angle is transformed into the angle of $K^n$ at z
as defined in §4, the difference between the two being of course that the latter
was defined in affine space whereas the definition of the former refers to $K^n$
as a differentiable space. The relationship between them implies that, if
$z = z(\zeta)$ is an inner point of $K^r_\lambda$, having in $K^r_\lambda$ the coordinates $\zeta^i$, the angle at z
is of dimension n and type r; we then denote it by $A_{r,\lambda}(\zeta)$; the linear manifold
$T^r_\lambda(\zeta)$ contained in $A_{r,\lambda}(\zeta)$ will be identified as usual with the tangent affine
space to $K^r_\lambda$ by the formulae $\partial\phi/\partial\zeta^i = \sum_\mu \partial\phi/\partial z^\mu \cdot \partial z^\mu/\partial\zeta^i$; it is spanned by the r
linearly independent vectors $(\partial z^\mu/\partial\zeta^i)$. We denote by $A^{n-r}_\lambda(\zeta)$ the dual angle
to $A_{r,\lambda}(\zeta)$, which is of dimension n-r and type 0; it is contained in the linear
manifold $N^{n-r}_\lambda(\zeta)$ of all vectors $y = (y_\mu)$ such that $(y, x) = 0$ for x in $T^r_\lambda(\zeta)$.

We now consider mappings $f(z) = (f^\alpha(z))$ of $K^n$ into an affine space $R^N$; f(z)
is said to be of class $C^m$ if each $f^\alpha(z)$ is of class $C^m$. A mapping $f(z) = (f^\alpha(z))$
will be said to define an n-dimensional curved cell $(K^n, f)$ if it is of class $C^1$
and the vectors $(\partial f^\alpha/\partial z^\mu)$ in $R^N$ are linearly independent for every z in $K^n$.
As usual, the linear manifold spanned by the vectors $(\partial f^\alpha/\partial z^\mu)$ in $R^N$ is identified
with the tangent affine space $T^n(z)$ to $K^n$ at z by identifying the point $x = (x^\mu)$
in $T^n(z)$ with the vector $(\sum_\mu x^\mu \cdot \partial f^\alpha/\partial z^\mu)$ in $R^N$; $T^n(z)$ thus appears as imbedded
in $R^N$. The manifold $T^r_\lambda(\zeta)$, as a submanifold of $T^n(z)$ when $z = z(\zeta)$
is in $K^r_\lambda$, is thus also imbedded in $R^N$, and as such is spanned by the vectors
$(\partial f^\alpha/\partial\zeta^i) = (\sum_\mu \partial f^\alpha/\partial z^\mu \cdot \partial z^\mu/\partial\zeta^i)$. In the same imbedding, the angle $A_{r,\lambda}(\zeta)$ appears
as an angle of dimension n and type r in $R^N$, contained in $T^n(z)$ and
containing $T^r_\lambda(\zeta)$. As the vectors $(\partial f^\alpha/\partial\zeta^i)$ are independent, the mapping f,
when restricted to $K^r_\lambda$, defines a curved cell $(K^r_\lambda, f)$ of dimension r in $R^N$.

We now take $R^N$ as a Euclidean space; cartesian coordinates being chosen
for convenience, the distance is defined by the form $(w, w) = \sum_\alpha (w^\alpha)^2$. The
quadratic differential form $(df, df) = \sum_\alpha (df^\alpha)^2 = \sum_{\mu,\nu} g_{\mu\nu} \cdot dz^\mu dz^\nu$ is nondegenerate,
under the assumptions made on f, and defines a Riemannian geometry
on $K^n$; this amounts to making the tangent affine space $T^n(z)$ into a Euclidean
space, either by means of its imbedding in $R^N$ or intrinsically by
$(x, x) = \sum_{\mu,\nu} g_{\mu\nu} \cdot x^\mu x^\nu$; the $g_{\mu\nu}$ are functions of z alone. We may then identify
$T^n(z)$ with its dual $\bar{T}^n(z)$, as in §3, by the correspondence $y_\mu = \sum_\nu g_{\mu\nu} \cdot x^\nu$; calling,
as usual, $\|g^{\mu\nu}\|$ the inverse matrix to $\|g_{\mu\nu}\|$, we have then $x^\mu = \sum_\nu g^{\mu\nu} \cdot y_\nu$; the
$y_\mu$ are called the covariant components of the tangent vector x, and the quantities
$\sum_\mu x^\mu \cdot \partial f^\alpha/\partial z^\mu$ are its components in $R^N$.

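The Riemannian geometry thus defined in $K^n$ induces on each $K^r_\lambda$ a Riemannian
geometry, with the fundamental form $(df, df) = \sum_{i,j} \gamma_{ij} \cdot d\zeta^i d\zeta^j$, where
$\gamma_{ij} = \sum_{\mu,\nu} g_{\mu\nu} \cdot \partial z^\mu/\partial\zeta^i \cdot \partial z^\nu/\partial\zeta^j$. The determinants of the matrices $\|g_{\mu\nu}\|$, $\|\gamma_{ij}\|$ are
denoted by g and $\gamma$, respectively; we have g>0, $\gamma$>0.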
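
The way the imbedding determines the coefficients of the fundamental form can be made concrete with a small computation. The Python sketch below is an illustration only: the map f(z1, z2) = (z1, z2, z1*z2) into 3-space is a hypothetical example chosen here, its partial derivatives are written out by hand, and the function names are not from the paper. Each coefficient of the form is the scalar product in 3-space of two of the vectors of partial derivatives.

```python
def jacobian(z):
    """Partial derivatives of the hypothetical map f(z1, z2) = (z1, z2, z1*z2)."""
    z1, z2 = z
    return [(1.0, 0.0, z2),    # derivative of f with respect to z1
            (0.0, 1.0, z1)]    # derivative of f with respect to z2

def induced_metric(z):
    """Matrix of scalar products of the partial-derivative vectors of f at z."""
    J = jacobian(z)
    return [[sum(a * b for a, b in zip(J[mu], J[nu])) for nu in range(2)]
            for mu in range(2)]

g = induced_metric((1.0, 2.0))
det_g = g[0][0] * g[1][1] - g[0][1] * g[1][0]
```

For this map the determinant works out to 1 + z1^2 + z2^2, which is positive everywhere, as it must be when the two derivative vectors are linearly independent.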

We now call $N^q(z)$ the orthogonal linear manifold to $T^n(z)$ in $R^N$, that is,
the normal linear manifold to the cell at z; and, taking $z = z(\zeta)$ to be an inner
point of $K^r_\lambda$, we apply to $A_{r,\lambda}(\zeta)$ the results of §3. Identifying, as we now do,
$T^n(z)$ with $\bar{T}^n(z)$, the dual linear manifold $N^{n-r}_\lambda(\zeta)$ to $T^r_\lambda(\zeta)$ within $T^n(z)$
appears as the orthogonal manifold to $T^r_\lambda(\zeta)$ within $T^n(z)$, that is, the normal
manifold to the subcell $(K^r_\lambda, f)$; the orthogonal manifold to $T^r_\lambda(\zeta)$ within $R^N$
is then $N^{n-r}_\lambda(\zeta) + N^q(z)$. The dual angle $A^{n-r}_\lambda(\zeta)$ to $A_{r,\lambda}(\zeta)$ within $T^n(z)$ is now
an angle of dimension n-r and type 0 in the normal manifold $N^{n-r}_\lambda(\zeta)$; it
is the same as the dual, taken within $N^{n-r}_\lambda(\zeta)$, of the trace of $A_{r,\lambda}(\zeta)$ on
$N^{n-r}_\lambda(\zeta)$. Finally, the dual $A^{N-r}_\lambda(\zeta)$ of $A_{r,\lambda}(\zeta)$ within $R^N$ is an angle of dimension
N-r and type q, and can be written as $A^{N-r}_\lambda(\zeta) = A^{n-r}_\lambda(\zeta) + N^q(z)$; this
means that a vector w is in $A^{N-r}_\lambda(\zeta)$ if, and only if, its orthogonal projection
on $T^n(z)$ is in $A^{n-r}_\lambda(\zeta)$.

It should be observed that the dual angle $A^{n-r}_\lambda(\zeta)$ to $A_{r,\lambda}(\zeta)$, as originally
defined in the dual affine space $\bar{T}^n(z)$ to $T^n(z)$, depends only upon $K^n$ regarded
as a differentiable space, irrespective of the choice of f or of a Riemannian
structure; and we write that a vector y in $\bar{T}^n(z)$, given by its components
$y_\mu$, is in $A^{n-r}_\lambda(\zeta)$ by writing that $\sum_\mu y_\mu \cdot X(z^\mu) \leq 0$ for every differentiation
X contained in the angle of $K^n$ at $z(\zeta)$. On the other hand, the angles in $R^N$
and in $T^n(z)$ which we have identified with $A^{n-r}_\lambda(\zeta)$, and which, for short, we
also denote by the same symbol, depend, the former upon the choice of the
mapping f, the latter merely upon the $g_{\mu\nu}$.

We now define the tube $T(K^n, f)$ of the curved cell $(K^n, f)$ as the subset
of $K^n \times B^N$ which consists of all points (z, w) of that product such that, if z
is an inner point of $K^r_\lambda$, and $z = z(\zeta)$, then w is in $A^{N-r}_\lambda(\zeta)$. Whenever f is an
affine mapping, that is, when the $f^\alpha$ are linear functions, the tube $T(K^n, f)$ is
the same as the tube $T(L^n)$ of the convex cell $L^n = f(K^n)$, as defined in §4.
Furthermore, if $(K^n, f)$ is an arbitrary cell, the set $\Theta_\delta(K^n, f)$ of all points at a
set-theoretical distance $\delta$ from $f(K^n)$ in $R^N$ is easily shown to be the same as
the set of all points $y^\alpha = f^\alpha(z) + \delta \cdot w^\alpha$ when (z, w) describes $T(K^n, f)$, and it
seems very likely that these relations define a one-to-one correspondence between
$\Theta_\delta(K^n, f)$ and $T(K^n, f)$ provided f itself is a one-to-one mapping and
provided $\delta$ is sufficiently small.

The central result of this paper is now implicit in the following lemma,
which will turn out to contain the Gauss-Bonnet formula for curved cells:


Lemma 5. The mapping $(z, w) \to w$ of the tube $T(K^n, f)$ into $B^N$ has everywhere
the degree +1.


The lemma has been proved in §4 for the Euclidean tube of a convex cell.
The general case will be reduced to that special case by continuous deformation.

As a preliminary step, we consider the topological space, each point of
which consists of a point z in $K^n$ and a set of q mutually orthogonal unit-vectors
in $N^q(z)$. This is a fibre-space over $K^n$, the fibre being homeomorphic
to the group of all orthogonal matrices of order q; therefore, by Feldbau's
theorem(9), it is the direct product of $K^n$ with the fibre; that implies that it
is possible to choose the q vectors $n_\rho(z)$ as continuous functions of z in $K^n$ so
as to satisfy the above conditions for every z. We call $n^\alpha_\rho(z)$ the components
of $n_\rho(z)$ in $R^N$.

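Let now $z = z(\zeta)$ be an inner point of $K^r_\lambda$, and w a point in $R^N$; call x, u the
orthogonal projections of w on $T^n(z)$ and $N^q(z)$, respectively; call $x_\mu$ the covariant
components of x, $u^\rho$ the components of u with respect to the basis-vectors
$n_\rho(z)$, so that we have

$$w^\alpha = \sum_{\mu,\nu} g^{\mu\nu} \cdot x_\nu \cdot \frac{\partial f^\alpha}{\partial z^\mu} + \sum_\rho u^\rho \cdot n^\alpha_\rho.$$

We have, then, $(w, w) = (x, x) + (u, u) = \sum_{\mu,\nu} g^{\mu\nu} \cdot x_\mu x_\nu + \sum_\rho (u^\rho)^2$; and (z, w) is
in the tube $T(K^n, f)$ if and only if x is in $A^{n-r}_\lambda(\zeta)$ and $(w, w) \leq 1$.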


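All that applies to the special case when $f^\alpha(z)$ is replaced by $\sum_\mu \delta^\alpha_\mu \cdot z^\mu$,
that is, by $z^\alpha$ for $\alpha = \mu \leq n$ and by 0 for $\alpha > n$, in which case the tube becomes
the Euclidean tube $\Theta^N$ of a convex cell; therefore, $z = z(\zeta)$ being again an inner
point of $K^r_\lambda$, (z, v) will be in $\Theta^N$ if and only if the vector in $\bar{T}^n(z)$ with the components
$v^\mu$ ($1 \leq \mu \leq n$) is in $A^{n-r}_\lambda(\zeta)$ and $\sum_\alpha (v^\alpha)^2 \leq 1$. Writing, therefore,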


ue = (1SpZq) 


these formulae, together with the formulae above, define a homeomorphic
correspondence between the points (z, v) of the Euclidean tube $\Theta^N$ and the
points (z, w) of $T(K^n, f)$.

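We now assume coordinates to be such that 0 is in $K^n$; calling $\tau$ a parameter
taking the values $0 \leq \tau \leq 1$, the point $\tau \cdot z = (\tau \cdot z^\mu)$ is in $K^n$ if z is in $K^n$.
For every $\tau > 0$, we consider the curved cell $(K^n, f_\tau)$ defined by $f_\tau(z) = f(\tau \cdot z)/\tau$.
Putting $\partial f^\alpha/\partial z^\mu = f^\alpha_\mu(z)$, we have, for the cell $(K^n, f_\tau)$, $\partial f^\alpha_\tau/\partial z^\mu = f^\alpha_\mu(\tau \cdot z)$,
with the coefficients $g_{\mu\nu}(\tau \cdot z)$, $g^{\mu\nu}(\tau \cdot z)$, and we may take as normal vectors to that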


(9) J. Feldbau, Sur la classification des espaces fibrés, C. R. Acad. Sci. Paris vol. 208 (1939)
p. 1621.


cell $n^\tau_\rho(z) = n_\rho(\tau \cdot z)$. That being so, the above formulae for the transformation
of $\Theta^N$ into $T(K^n, f_\tau)$ show that this transformation depends continuously
upon $\tau$, and therefore that the tube $T(K^n, f_\tau)$ is deformed continuously when $\tau$
varies. When $\tau$ tends to 0, these formulae tend to the corresponding formulae
for the cell $(K^n, f_0)$ defined by $f^\alpha_0(z) = \sum_\mu f^\alpha_\mu(0) \cdot z^\mu$ when the normal vectors
for $(K^n, f_0)$ are taken as $n^0_\rho(z) = n_\rho(0)$; $f_0$ being affine, $(K^n, f_0)$ is a convex cell,
to which the results of §4 apply.

Lemma 5 follows easily. For the image of our tube in $B^N$ by the mapping
$(z, w) \to w$ is deformed continuously when the tube is so deformed; the image
of its boundary remains in $S^{N-1}$. The degree is therefore constant during the
deformation; as it is +1 for $\tau = 0$, it is +1 for $\tau = 1$, which was to be proved.

6. The Gauss-Bonnet formula for imbedded cells. We put

$$dw = dw^1 \cdot dw^2 \cdots dw^N.$$

A special consequence of Lemma 5 in §5 is that the integral of dw over the
tube $T(K^n, f)$ is equal to the integral of the same differential form over
$B^N$, that is, to the volume $v(B^N)$ of the interior of the unit-sphere in $R^N$.
Therefore, calling $I_{r,\lambda}$ the integral of $dw/v(B^N)$ over the set of those points
(z, w) in the tube for which z is an inner point of $K^r_\lambda$, we have

$$\sum_{r=0}^{n} \sum_\lambda I_{r,\lambda} = 1.$$

This becomes the Gauss-Bonnet formula when the $I_{r,\lambda}$ are expressed intrinsically
in terms of the Riemannian geometry on $K^n$. The calculation depends
upon a lemma which immediately follows from a formula proved in a recent
paper by H. Weyl(10).
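
In the simplest flat case the sum of these integrals can be evaluated by elementary means. The Python sketch below is an illustration under assumptions made here and not stated in this form in the paper: the cell is a convex polygon in the plane (N = n = 2) with f the identity, so that, as we read the definitions, only the vertices contribute, the dual angles at points of edges or of the interior having dimension less than 2; the dual angle at a vertex is the sector of outward directions, whose opening is the exterior angle there. The function names are hypothetical.

```python
import math

def exterior_angles(polygon):
    """Turning angle at each vertex of a convex polygon listed counterclockwise."""
    k = len(polygon)
    out = []
    for i in range(k):
        ax, ay = polygon[i - 1]
        bx, by = polygon[i]
        cx, cy = polygon[(i + 1) % k]
        ux, uy = bx - ax, by - ay      # incoming edge
        vx, vy = cx - bx, cy - by      # outgoing edge
        out.append(math.atan2(ux * vy - uy * vx, ux * vx + uy * vy))
    return out

def vertex_contributions(polygon):
    """Area of the outer sector at each vertex inside the unit disc, divided by pi."""
    return [(theta / 2) / math.pi for theta in exterior_angles(polygon)]

# Hypothetical examples, both convex and listed counterclockwise.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
triangle = [(0, 0), (4, 0), (1, 3)]
total_square = sum(vertex_contributions(square))
total_triangle = sum(vertex_contributions(triangle))
```

Each vertex of the square contributes one quarter, and in both examples the contributions add up to 1, which in this case is the statement that the exterior angles of a convex polygon add up to 2 pi.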
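Lemma 6. Let $\|\Lambda_{ij}\|$, $\|L^\rho_{ij}\|$ be $q+1$ matrices of order $r$; and write

$$P_{i_1 i_2 j_1 j_2} = \sum_\rho \left( L^\rho_{i_1 j_1} L^\rho_{i_2 j_2} - L^\rho_{i_1 j_2} L^\rho_{i_2 j_1} \right).$$

Then the integral of $|\Lambda_{ij} + \sum_\rho L^\rho_{ij}\, u^\rho| \cdot du^1\cdot du^2 \cdots du^q$, taken over the volume
$\sum_\rho (u^\rho)^2 \le c^2$, is equal to:

$$v(B^q)\, c^q \sum_{f=0}^{[r/2]} \frac{k_{q,f}\, c^{2f}}{2^{2f}\, f!\, (r-2f)!} \sum \epsilon_{i_1 \cdots i_r}\, \epsilon_{j_1 \cdots j_r}\, P_{i_1 i_2 j_1 j_2} \cdots P_{i_{2f-1} i_{2f} j_{2f-1} j_{2f}}\, \Lambda_{i_{2f+1} j_{2f+1}} \cdots \Lambda_{i_r j_r},$$

where $k_{q,f} = 1/\big((q+2)(q+4)\cdots(q+2f)\big)$, and the conventions about summation
are as explained in §1.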
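The constant $k_{q,f}$ can be checked numerically in a simple special case. The sketch below is not the lemma itself: it only verifies the standard moment identity that the average of $(u^1)^{2f}$ over the unit ball of $R^q$ equals $(2f-1)!!\cdot k_{q,f}$, which is the kind of elementary integral the lemma rests on. The function names and the choice of Simpson's rule are illustrative assumptions, not taken from the paper.

```python
import math

def k(q, f):
    """k_{q,f} = 1 / ((q+2)(q+4)...(q+2f)), with k_{q,0} = 1."""
    prod = 1
    for j in range(1, f + 1):
        prod *= q + 2 * j
    return 1.0 / prod

def double_factorial_odd(f):
    """(2f-1)!! = 1*3*5*...*(2f-1), equal to 1 for f = 0."""
    out = 1
    for j in range(1, 2 * f, 2):
        out *= j
    return out

def ball_moment(q, f, steps=20000):
    """Average of (u^1)^(2f) over the unit ball in R^q, by Simpson's rule.

    The slice of the ball at height u^1 = t is a (q-1)-ball of radius
    sqrt(1 - t^2), so the marginal density of u^1 is proportional to
    (1 - t^2)^((q-1)/2).
    """
    def simpson(g):
        h = 2.0 / steps
        total = g(-1.0) + g(1.0)
        for i in range(1, steps):
            total += (4 if i % 2 else 2) * g(-1.0 + i * h)
        return total * h / 3.0

    def weight(t):
        # max() guards against tiny negative values from rounding near |t| = 1
        return max(0.0, 1.0 - t * t) ** ((q - 1) / 2.0)

    return simpson(lambda t: t ** (2 * f) * weight(t)) / simpson(weight)
```

For instance, `ball_moment(3, 1)` should agree with `double_factorial_odd(1) * k(3, 1)`, that is with 1/5, up to the quadrature error.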


(#*) Loc. cit., Footnote 7, p. 470. Similar calculations may also be found in W. Killing: 
Die nicht-euklidischen Raumformen in analytischer Behandlung, Teubner, Leipzig, 1885, p. 255. 




1943] THE GAUSS-BONNET THEOREM 117 


Our calculation of $I_{r,\lambda}$ will be valid under the assumption that the mapping
$f(z)$ is of class $C^2$; in order, however, to be able to introduce the Riemannian
curvature tensor, we assume from now onwards that $f(z)$ is of class
$C^3$. In the course of the calculation of $I_{r,\lambda}$, we simplify notations by omitting
the subscript $\lambda$.

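We may calculate $I_r$ by cutting up the cell $K^r$ into small subsets, and
cutting up $I_r$ correspondingly; we take those subsets to be cells of a subdivision
of $K^r$, and so small that it is possible to define, on each of them, $q$ vectors
$n_\rho(\zeta)$ of class $C^1$ and $n-r$ vectors $\nu_\sigma(\zeta)$, also of class $C^1$, satisfying the following
conditions: the $n_\rho(\zeta)$ are an orthonormal basis for the normal linear manifold
$N^q(z)$ at $z(\zeta)$; the $\nu_\sigma(\zeta)$ are an orthonormal basis for the normal manifold
$N^{n-r}(\zeta)$ to $K^r$ in $T^n(z)$; and, calling $\nu^\alpha_\sigma$, $n^\alpha_\rho$ the components of those vectors
in $R^N$, the matrix $\Delta = \| \partial f^\alpha/\partial\zeta^i \;\; \nu^\alpha_\sigma \;\; n^\alpha_\rho \|$ has a determinant greater than 0. The
latter determinant can then be calculated by observing that, if $\Delta^T$ is the transpose
of $\Delta$ and $\Gamma = \|\gamma_{ij}\|$, we have

$$\Delta^T\cdot\Delta = \begin{Vmatrix} \Gamma & 0 & 0 \\ 0 & 1_{n-r} & 0 \\ 0 & 0 & 1_q \end{Vmatrix}$$

(the symbols $1_{n-r}$, $1_q$ denoting unit matrices), and therefore $(|\Delta|)^2 = \gamma$, so that $|\Delta| = +\gamma^{1/2}$.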

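$z = z(\zeta)$ belonging to one of our subsets in $K^r$, let $(z, w)$ be in the tube
$T(K^n, f)$. Let $x$, $u$ be the orthogonal projections of $w$ on $T^n(z)$ and $N^q(z)$,
respectively; $x$ is in $A^{n-r}(\zeta) \subset N^{n-r}(\zeta)$, so that $x$ can be written as $\sum_\sigma \nu_\sigma(\zeta)\cdot t^\sigma$;
let $u^\rho$ be the components of $u$ with respect to the basis $n_\rho$. We have:

$$w = \sum_\sigma \nu_\sigma(\zeta)\cdot t^\sigma + \sum_\rho n_\rho(\zeta)\cdot u^\rho.$$

As these are functions of $\zeta$, $t$, $u$ of class $C^1$, we can express $dw$ in terms of
$d\zeta = d\zeta^1\cdot d\zeta^2 \cdots d\zeta^r$, $dt = dt^1\cdot dt^2 \cdots dt^{n-r}$, $du = du^1\cdot du^2 \cdots du^q$:

$$dw = \pm \left| \frac{\partial w^\alpha}{\partial \zeta^i} \;\; \nu^\alpha_\sigma \;\; n^\alpha_\rho \right| \cdot d\zeta\cdot dt\cdot du.$$

The determinant is best calculated by multiplying its matrix to the left by $\Delta^T$,
the determinant of which has been found to be $+\gamma^{1/2}$; that gives a matrix of
the form

$$\begin{Vmatrix} M & 0 \\ * & 1_{N-r} \end{Vmatrix},$$

which has the determinant $|M|$. That gives:

$$dw = \pm \gamma^{-1/2} \cdot \Big| \Lambda_{ij} + \sum_\rho L^\rho_{ij}\, u^\rho \Big| \cdot d\zeta\cdot dt\cdot du$$

if we put

$$\Lambda_{ij} = \sum_{\alpha,\sigma} \frac{\partial f^\alpha}{\partial \zeta^i}\, \frac{\partial \nu^\alpha_\sigma}{\partial \zeta^j}\, t^\sigma, \qquad L^\rho_{ij} = \sum_\alpha \frac{\partial f^\alpha}{\partial \zeta^i}\, \frac{\partial n^\alpha_\rho}{\partial \zeta^j}.$$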
In the integration of this, orientation has to be considered. Call $t$, $u$ the points
with the coordinates $(t^\sigma)$, $(u^\rho)$, respectively, in two auxiliary spaces $P^{n-r}$, $P^q$;
we also consider the point with the coordinates $\zeta^i$, $t^\sigma$, $u^\rho$ in the space
$P^N = K^r \times P^{n-r} \times P^q$. The formulae $z = z(\zeta)$, $w = \sum_\sigma \nu_\sigma\cdot t^\sigma + \sum_\rho n_\rho\cdot u^\rho$ define the
portion of the tube now under consideration as a homeomorphic image of class
$C^1$ of the subset of $P^N$ defined as follows: $\zeta$ is in a given subset of $K^r$; $t$ is such
that $\sum_\sigma \nu_\sigma(\zeta)\cdot t^\sigma$ is in $A^{n-r}(\zeta)$; and $\sum_\sigma (t^\sigma)^2 + \sum_\rho (u^\rho)^2 \le 1$. As $A^{n-r}$ depends
continuously upon $\zeta$, that set is the closure of an open set in $P^N$. Call now
$o_1$, $o_2$, $o_3$ any orientations of $K^r$, $P^{n-r}$, $P^q$, respectively; the factors in the
product $P^N = K^r \times P^{n-r} \times P^q$ being ordered as written, $o_1$, $o_2$, $o_3$ define an orientation
$o_1 \times o_2 \times o_3$ in $P^N$, and therefore a local orientation, also denoted by
$o_1 \times o_2 \times o_3$, in the part of the tube which we are discussing. On the other hand,
the mappings $t \to \sum_\sigma \nu_\sigma\cdot t^\sigma$, $u \to \sum_\rho n_\rho\cdot u^\rho$ of $P^{n-r}$, $P^q$ onto $N^{n-r}(\zeta)$, $N^q(z)$
transform $o_2$, $o_3$ into orientations, also denoted by $o_2$, $o_3$, of $N^{n-r}(\zeta)$, $N^q(z)$.
We now choose for $o_1$, $o_2$, $o_3$ the natural orientations of $K^r$, $P^{n-r}$, $P^q$, respectively,
defined by the coordinates $\zeta^i$, $t^\sigma$, $u^\rho$ taken in each case in their natural
order. The condition on the sign of $|\Delta|$ which served to define the $\nu_\sigma$, $n_\rho$
amounts to saying that the orientations $o_1$, $o_2$, $o_3$ of $f(K^r)$, $N^{n-r}(\zeta)$, $N^q(z)$ at
$z = z(\zeta)$ define, when taken in that order, the natural orientation of $R^N$. That
being so, we now show that the local orientation of the tube defined as
$o_1 \times o_2 \times o_3$ coincides with that orientation $\Omega$ of the tube as a whole which
ensures the validity of Lemma 5. That is easily verified for the tube of a
convex cell, by identifying it with a subset of $R^N$ as in §4. In the general
case we use the deformation of our tube into that of a convex cell, by means
of which we proved Lemma 5; for, in such a deformation, the manifolds
$N^{n-r}(\zeta)$, $N^q(z)$ vary continuously, and therefore we have $o_1 \times o_2 \times o_3 = \Omega$ during
the whole deformation, since this is true for one value $\tau = 0$ of the parameter.

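We can now proceed to integrate $dw$ by first integrating with respect to
$u$ while $\zeta$ and $t$ are kept constant; $u$ is to be given all values such that
$\sum_\rho (u^\rho)^2 \le 1 - \sum_\sigma (t^\sigma)^2$. We first observe that, by differentiating the relations
$\sum_\alpha \partial f^\alpha/\partial\zeta^i \cdot \nu^\alpha_\sigma = 0$, $\sum_\alpha \partial f^\alpha/\partial\zeta^i \cdot n^\alpha_\rho = 0$ which express that $\nu_\sigma$, $n_\rho$ are normal vectors
to $T^r(\zeta)$, we get the following expressions for $\Lambda_{ij}$, $L^\rho_{ij}$:

$$\Lambda_{ij} = -\sum_\alpha \frac{\partial^2 f^\alpha}{\partial\zeta^i\, \partial\zeta^j}\cdot x^\alpha, \qquad L^\rho_{ij} = -\sum_\alpha \frac{\partial^2 f^\alpha}{\partial\zeta^i\, \partial\zeta^j}\cdot n^\alpha_\rho,$$

where $x^\alpha = \sum_\sigma \nu^\alpha_\sigma\cdot t^\sigma$ are the components of the vector $x$ in $R^N$; these are the
negatives of coefficients of the so-called “second fundamental forms” of $f(K^r)$
in $R^N$. The $\Lambda_{ij}$ are thus seen not to depend upon the choice of the basis-vectors
$\nu_\sigma$ in $N^{n-r}(\zeta)$, but only upon the vector $x$; as such, we shall now call them
$\Lambda_{ij}(x)$; it is known that they are intrinsic quantities with respect to the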






Riemannian geometry in $K^n$, and can be expressed by formula (3) in §1, if we
denote by $x_\mu$ the covariant components of $x$; we have $x_\mu = \sum_\alpha \partial f^\alpha/\partial z^\mu \cdot x^\alpha$.
The application of Lemma 6 further leads to the introduction of the quantities


which also are known to be intrinsic quantities, their expression in terms of
the curvature tensor in $K^n$ being given by formula (2) in §1. We now distinguish
two cases:

(a) If $r = n$, the $\zeta^i$ in the foregoing calculation should be read as $z^\mu$, and $\gamma$
as $g$; there are no $\nu_\sigma$, no $t^\sigma$, no $\Lambda_{ij}$. Integrating $dw$ first with respect to $u$, we
get, by straightforward application of Lemma 6:

$$I_n = \int_{K^n} \Psi(z)\, dv(z),$$

where $\Psi(z)$ is defined by formula (1) in §1.
(b) If $r < n$, the integration of $dw$ with respect to $u$ by Lemma 6 gives, if
we define the functions $\Phi_{r,f}(\zeta, x)$ by formula (4) in §1


[r/2} (@/2)+F 


f=0 


where $dv(\zeta) = \gamma^{1/2}\cdot d\zeta$ is the intrinsic volume-element in $K^r$. We may push the
integration one step further, by writing $x = a\cdot\xi$, $x_\mu = a\cdot\xi_\mu$, $t^\sigma = a\cdot\tau^\sigma$, where
$\sum_\sigma (\tau^\sigma)^2 = 1$ and $0 \le a \le 1$; $\xi_\mu$ are thus the covariant components of vector $\xi$,
$\tau^\sigma$ its components with respect to the basis $\nu_\sigma$, and $\xi$ is on the unit-sphere in
$N^{n-r}(\zeta)$; $\xi$ describes a spherical cell $\Gamma(\zeta)$, the trace on that sphere of $A^{n-r}(\zeta)$;
$\Gamma(\zeta)$ is the outer angle in $N^{n-r}(\zeta)$ of the trace of $A_r(\zeta)$ on $N^{n-r}(\zeta)$. Calling $d\xi$
the area-element or spherical measure on that sphere, we have $dt = a^{n-r-1}\cdot da\cdot d\xi$.
We can now carry out the integration in $a$, which involves only the elementary
integral $\int_0^1 (1-a^2)^{(q/2)+f}\cdot a^{n-2f-1}\cdot da$, and thus find


where $\Psi$ is defined by formula (5) in §1. This, combined with our earlier result
$\sum_{r,\lambda} I_{r,\lambda} = 1$, completes the proof of Theorem II for $K^n$, with the Riemannian
structure defined by the $g_{\mu\nu}$, if we observe that the inner characteristic
of $K^n$ is $\chi'(K^n) = (-1)^n$.

It may be observed that, for $r = n-1$, the outer angle $\Gamma(\zeta)$ is reduced to
a point, namely, the unit-vector $\xi$ on the outer normal to $K^{n-1}$ in the tangent
space to $K^n$; the integral in $d\xi$ should then be understood to mean the value
of the integrand at that point. Similarly, for $r = 0$, $K^r$ is reduced to a point,




and the integral in $dv(\zeta)$ should be understood correspondingly. In the latter
case, $I_0$ contains only one term, corresponding to $f = 0$, which is simply the
spherical measure of the outer angle $\Gamma(\zeta)$, measured with the area of the
sphere taken as the unit. In the case of a Euclidean convex cell, the terms $I_0$
in our formula are the only ones which do not reduce to 0.
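For a Euclidean convex cell of dimension 2, the statement above says that the outer angles at the vertices, each measured as a fraction of the full circle, add up to 1. The following minimal sketch checks this for convex polygons; the function name and the counter-clockwise vertex convention are assumptions made for the illustration and do not come from the paper.

```python
import math

def outer_angle_fractions(vertices):
    """Outer angle at each vertex of a convex polygon, as a fraction of 2*pi.

    `vertices` is a list of (x, y) pairs in counter-clockwise order. At a
    vertex of a convex polygon the outer angle (the arc of unit normals
    pointing away from the polygon) equals the turning angle between the
    incoming and the outgoing edge.
    """
    n = len(vertices)
    fractions = []
    for i in range(n):
        ax, ay = vertices[i - 1]
        bx, by = vertices[i]
        cx, cy = vertices[(i + 1) % n]
        ux, uy = bx - ax, by - ay          # incoming edge direction
        vx, vy = cx - bx, cy - by          # outgoing edge direction
        turn = math.atan2(ux * vy - uy * vx, ux * vx + uy * vy)
        fractions.append(turn / (2 * math.pi))
    return fractions
```

For the unit square each vertex contributes one quarter of the circle, and for any convex polygon the contributions sum to 1, which is the two-dimensional Euclidean case of the formula $\sum I = 1$.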

As a preparation to §7, we furthermore have to prove some identities
concerning the application of the above results to cells of lower dimension
imbedded in $K^n$. Let $L^p$ be a convex cell, $\phi$ a one-to-one mapping of class $C^3$
of $L^p$ into $K^n$, such that $(L^p, \phi)$ is a curved cell; we assume that $0 \le p \le n-1$.
For simplicity of notations, we identify $L^p$ with its image in $K^n$ by $\phi$, and
call $(L^p, f)$ the curved cell which according to earlier conventions should be
written as $(L^p, h)$ where $h$ is the product of the two mappings $f$, $\phi$. $L^r$ denoting
either $L^p$ or, for $0 \le r \le p-1$, any one of the boundary cells of $L^p$, we choose
coordinates $\zeta^i$ on $L^r$, and again identify $L^r$ with its image in $K^n$. The part of
the tube of $(L^p, f)$ which corresponds to $L^r$ then consists of all points $(z, w)$
in $K^n \times B^N$ for which $z = z(\zeta)$ is an inner point of $L^r$, $(w, w) \le 1$, and $w$ is in
the dual in $R^N$ to the angle of $L^p$ at this point; the latter angle is in the tangent
linear manifold to $L^p$, which as before should be considered as imbedded
in the tangent linear manifold $T^n(z)$ to $K^n$ at the same point, and is of dimension
$p$ and type $r$; we denote it by $B_r(\zeta)$. Let $N^{n-r}(\zeta)$ be the normal linear
manifold to $L^r$ at $\zeta$; the dual to $B_r(\zeta)$ in $R^N$ consists of all vectors $w$ whose
projection $x$ on $T^n(z)$ belongs to the dual to $B_r(\zeta)$ in $T^n(z)$, which is contained
in $N^{n-r}(\zeta)$. Let $L'^r$ be an open subset of $L^r$, so small that we may define on
it $q$ vectors $n_\rho$ and $n-r$ vectors $\nu_\sigma$ precisely as before ($K^r$ being replaced
by $L^r$). The calculation and integration of $dw$ for that part of the tube consisting
of all points $(z, w)$ with $z$ in $L'^r$ now proceeds, without any change,
just as before; the case $r = n$ does not arise, as $r \le p \le n-1$; calling $I$ the integral
of $dw/v(B^N)$ over that part of the tube, we have, therefore:


$$I = \int_{L'^r} dv(\zeta) \int_{\Gamma(L^p,\zeta)} \Psi(\zeta, \xi \,|\, L^r)\, d\xi,$$


where we now denote by $\Gamma(L^p, \zeta)$ the trace on the unit-sphere in $N^{n-r}(\zeta)$ of
the dual to $B_r(\zeta)$ in $T^n(z)$. On the other hand, we could have applied our
method to $L^p$ itself, considered intrinsically and not as imbedded in $K^n$; this,
for $r = p$, would have given us


if we denote by $\Psi_0(\zeta)$ the invariant built up in $L^p$ just as $\Psi(z)$ was built up
in $K^n$. As this is true for any sufficiently small $L'^p$, we get, for every inner
point $\zeta$ of $L^p$, the identity


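(6) $\Psi_0(\zeta) = \int_{\Gamma(L^p,\zeta)} \Psi(\zeta, \xi \,|\, L^p)\, d\xi,$


where $\Gamma(L^p, \zeta)$, being as above defined, is easily seen to be the full sphere in
$N^{n-p}(\zeta)$. Similarly, for $0 \le r \le p-1$, we denote by $\Psi_0(\zeta, \xi_0 \,|\, L^r)$ the quantity,
similar to $\Psi$, which is built up in $L^p$ from the Riemannian structure defined
on $L^p$ by its imbedding in $K^n$, from the imbedded submanifold $L^r$, and from
a unit-vector $\xi_0$ normal to $L^r$ in the tangent linear manifold $T^p(\zeta)$ to $L^p$; and
calling $\Gamma_0(L^p, \zeta)$ the trace on the unit-sphere of the dual to $B_r(\zeta)$ in $T^p(\zeta)$, we
get as before, $\zeta$ being any inner point of $L^r$:


(7) $\int_{\Gamma_0(L^p,\zeta)} \Psi_0(\zeta, \xi_0 \,|\, L^r)\, d\xi_0 = \int_{\Gamma(L^p,\zeta)} \Psi(\zeta, \xi \,|\, L^r)\, d\xi.$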


The identities (6), (7) contain only quantities which are intrinsic in $K^n$ for
the Riemannian structure defined in $K^n$ by the metric tensor $g_{\mu\nu}$. They have
just been proved for the case in which the $g_{\mu\nu}$ are defined by a mapping $f$
of $K^n$ into $R^N$; however, they depend only upon the $g_{\mu\nu}$ and their derivatives
of the first and second order at point $z(\zeta)$. It is easy to define a small cell $K'^n$
containing a neighborhood of point $z(\zeta)$ in $K^n$, and a mapping $f'$ of $K'^n$ into
a Euclidean $R^{N'}$, so that $(K'^n, f')$ is a curved cell and that the $g'_{\mu\nu}$ defined by
$f'$ over $K'^n$ have, together with their derivatives of first and second order,
prescribed values at $z(\zeta)$; in fact, we may do that by taking any analytic $g'_{\mu\nu}$
satisfying the latter conditions, and applying Cartan’s theorem(4), but there are
of course more elementary methods of obtaining the same result. As (6), (7)
are purely local properties of the Riemannian cell $K^n$ and of the imbedded
$L^p$, $L^r$, they are thus shown to hold without any restriction. They could, of
course, be verified by direct calculation; this would be straightforward but
cumbersome, and would require another application of Lemma 6.

7. The Gauss-Bonnet formula for Riemannian polyhedra. We first define 
Riemannian polyhedra as follows. 

Let $P^n$ be a compact connected topological space, for which there has been
given a covering by open subsets $\Omega_\iota$ and a homeomorphic mapping $\phi_\iota$ of each
$\Omega_\iota$ onto an $n$-dimensional convex angle $C_\iota$ which may be $R^n$; if the $\phi_\iota$ and the
inverse mappings $\psi_\iota$ are such that every $\phi_\kappa[\psi_\iota(x)]$ is of class $C^m$ at every
$x \in C_\iota$ such that $\psi_\iota(x) \in \Omega_\kappa$, $P^n$ will be called an $n$-dimensional differentiable
polyhedron of class $C^m$. As noted before (§2), re-entrant angles would lend
themselves to similar treatment but are purposely avoided for simplicity’s
sake.

By a differentiable cell of class $C^m$, we understand a differentiable polyhedron
of class $C^m$ which can be put into a one-to-one correspondence of class
$C^m$ with a convex cell.

The beginning of §5 provides a definition for the tangent affine space and
the angle of a differentiable cell at any one of its points; those definitions,






being purely local, apply without any change to a differentiable polyhedron.
If $C$ is the angle of $P^n$ at the point $z$, $z$ has a neighborhood homeomorphic to $C$;
if $C$ is of type $r$, we say that $z$ is of type $r$ in $P^n$. Points of type $n$ in $P^n$ are
called inner points of $P^n$. Points of type at most $r$ (where $0 \le r \le n$) form a
closed, and therefore compact, subset of $P^n$, the closure of the set of the points
of type $r$; if the latter consists of $N_r$ connected components, the former is the
union of $N_r$, and not of less than $N_r$, differentiable polyhedra $P^r_\lambda$ of dimension
$r$. A point of type $r$ is an inner point of one of the $P^r_\lambda$ and of no other $P^s_\mu$;
if an inner point of $P^r_\lambda$ is contained in $P^s_\mu$, then $P^r_\lambda \subset P^s_\mu$. The $P^r_\lambda$, for $0 \le r \le n-1$,
will be called the boundary polyhedra of $P^n$.

By a regular subpolyhedron $Q^p$ in $P^n$, we understand the one-to-one
image of a polyhedron $Q^p_0$ in $P^n$, provided it satisfies the following conditions:
$\zeta^i$ being local coordinates in $Q^p_0$ at any point, and $z^\mu$ local coordinates in $P^n$
at the image of that point, the functions $z^\mu(\zeta)$ which locally define the mapping
are of the same class $C^m$ as the polyhedron $P^n$, and the matrix $\|\partial z^\mu/\partial\zeta^i\|$
is of rank $p$. Each boundary polyhedron $P^r_\lambda$ of $P^n$ is a regular subpolyhedron
of $P^n$.

We say that a finite set of distinct regular subpolyhedra $Q^r_\rho$ of $P^n$ forms
a subdivision $D$ of $P^n$ if the following conditions are fulfilled: (a) each point
of $P^n$ is an inner point of at least one $Q^r_\rho$ in $D$; (b) if $Q^r_\rho$ and $Q^s_\sigma$, in $D$, are such
that there is an inner point of $Q^r_\rho$ contained in $Q^s_\sigma$, then $Q^r_\rho \subset Q^s_\sigma$. From (b), it
follows that no two polyhedra in $D$ can have an inner point in common unless
they coincide.

$P^n$ and its boundary polyhedra $P^r_\lambda$ thus form a subdivision of $P^n$, which
we call the canonical subdivision. If $D$ is any subdivision of $P^n$, those polyhedra
$Q^r_\rho$ in $D$ which are contained in a given polyhedron $Q^s_\sigma$ in $D$ form a subdivision
of $Q^s_\sigma$.


Lemma 7. If $Q^r$ is a polyhedron in a subdivision $D$ of $P^n$, all inner points
of $Q^r$ have the same type in $P^n$.


An inner point of $Q^r$ obviously has a type at least $r$ in $P^n$; hence the lemma
is true for $r = n$; we prove it by induction, assuming it to hold for all $Q^s_\sigma$ in $D$
with $s > r$. Let $\zeta$ be an inner point of $Q^r$; call $s$ its type in $P^n$, so that $s \ge r$;
$\zeta$ is then inner point of some $P^s_\lambda$; we need only show that all points of $Q^r$,
sufficiently near to $\zeta$, are in $P^s_\lambda$. That will be the case if all points of $P^s_\lambda$, sufficiently
near to $\zeta$, are in $Q^r$; for then, since $P^s_\lambda$ and $Q^r$ are of class at least $C^1$
and regular in $P^n$, we must have $s = r$, and $P^s_\lambda$, $Q^r$ must coincide in a neighborhood
of point $\zeta$. If that is not so, then $\zeta$ must be a limiting point of inner
points of $P^s_\lambda$ which are not in $Q^r$; as each of the latter points is an inner point
of a polyhedron in $D$, and there is only a finite number of such polyhedra, it
follows that there is a $Q^t$ in $D$, such that $\zeta$ is a limiting point of inner points
of $Q^t$, each of which is an inner point of $P^s_\lambda$ and is not in $Q^r$. This implies that
$\zeta \in Q^t$, and therefore $Q^r \subset Q^t$; hence $t > r$, as otherwise an inner point of $Q^r$




would be inner point of $Q^t$, and $Q^r$ would be the same as $Q^t$. By the induction
assumption, the lemma holds for $Q^t$; as there are inner points of $Q^t$ which are
inner points of $P^s_\lambda$, we have, therefore, $Q^t \subset P^s_\lambda$, and so $Q^r \subset P^s_\lambda$; this proves the
lemma.

An immediate consequence is that all the polyhedra, in a subdivision $D$
of $P^n$, which are contained in a given boundary polyhedron $P^r_\lambda$ of $P^n$, form
a subdivision of that $P^r_\lambda$; this can be expressed by saying that every subdivision
of $P^n$ is a refinement of the canonical subdivision. In particular, if a
polyhedron $Q^r$, in a subdivision $D$ of $P^n$, contains at least one inner point of
$P^n$, all inner points of $Q^r$ are inner points of $P^n$; $Q^r$ is then called an inner
polyhedron of the subdivision.


Lemma 8. $D$ being a subdivision of $P^n$, and $z$ any point of $P^n$, the angles at $z$
of those polyhedra in $D$ which contain $z$ form a subdivision of the angle of $P^n$
at $z$; the inner angles in the latter subdivision are the angles of the inner polyhedra
in $D$ which contain $z$.


In the proof of this lemma, we shall denote by $A(Q)$, $Q$ being any regular
subpolyhedron of $P^n$, the angle of $Q$ at $z$, if $z \in Q$, and the null-set otherwise.
Let $x$ be any vector in $A(P^n)$, defined by an operator $X\phi = \lim \xi\cdot[\phi(z') - \phi(z)]$,
where $z'$ tends to $z$ within $P^n$ and $\xi$ tends to $+\infty$; as every $z'$ is an inner point
of a $Q^r_\rho$ in $D$, and there is only a finite number of such $Q^r_\rho$, we may define $x$
by a sequence of $z'$, all belonging to one and the same $Q^r_\rho$; $A(Q^r_\rho)$ then contains
$x$. Let $Q^r$ be a polyhedron of the lowest dimension in $D$, such that $x \in A(Q^r)$;
if $x$ were not an inner point of $A(Q^r)$, it would be in the angle at $z$ of a boundary
polyhedron $Q'^s$ of $Q^r$, with $s < r$. The polyhedra in $D$ which are contained
in $Q^r$ form a subdivision of $Q^r$, and so, by Lemma 7, those which are contained
in $Q'^s$ form a subdivision of $Q'^s$; $x$ would therefore be in the angle at $z$ of one
of the latter polyhedra, which would be of dimension at most $s$, in contradiction
with the definition of $Q^r$. This shows that $x$ is an inner vector of $A(Q^r)$.
Suppose that, at the same time, $x$ is an inner vector of $A(P^n)$; and let $x$ be
defined by $X\phi = \lim \xi\cdot[\phi(z') - \phi(z)]$ where the $z'$ are in $Q^r$; all $z'$, sufficiently
near to $z$, must be inner points of $P^n$ (otherwise $x$ would not be an inner point
of $A(P^n)$), and so $Q^r$ must be an inner polyhedron of the subdivision $D$. On
the other hand, if $x$ is not an inner point of $A(P^n)$, it must be in the angle at $z$
of a boundary polyhedron $P^s_\lambda$ of $P^n$; since those polyhedra of $D$ which are
contained in $P^s_\lambda$ form a subdivision of $P^s_\lambda$, it follows, as above, that $x$ is then an
inner point of an angle $A(Q^t)$, where $Q^t$ is a polyhedron in $D$ and is contained
in $P^s_\lambda$.

The proof of the lemma will now be complete if we show that, whenever $Q^r_\rho$
and $Q^s_\sigma$ belong to $D$ and there is an inner point of $A(Q^r_\rho)$ contained in $A(Q^s_\sigma)$,
$Q^r_\rho$ itself is contained in $Q^s_\sigma$. Using induction, we may, in doing this, assume
that the lemma is true for all subdivisions of polyhedra of dimension less
than $n$ (the lemma is obviously true when $P^n$ has the dimension 1). The question
being purely local, we need consider only a small neighborhood of $z$ in $P^n$,
which we may identify with a convex angle in $R^n$; by the distance of two
points in that neighborhood, we understand the Euclidean distance as measured
in $R^n$. Let $Q^r$ be a polyhedron in $D$, such that $z \in Q^r$; let $x$ be an inner
vector of $A(Q^r)$, defined as above by an operator $X\phi = \lim \xi\cdot[\phi(z') - \phi(z)]$,
where we may assume that $z'$ runs over a sequence of inner points of $Q^r$ tending
to $z$. In $R^n$, the direction of the vector $zz'$ tends to that of the vector $x$.
Our lemma will be proved if, assuming furthermore that $x$ is in the angle at $z$
of a polyhedron in $D$ which does not contain $Q^r$, we show that this implies a
contradiction. But the latter assumption implies that, if $w$ is a nearest point
to $z'$ in the union $W$ of those polyhedra in $D$ which contain $z$ and do not
contain $Q^r$, the direction of the vector $zw$ tends to that of $x$; we need therefore
only show that this implies a contradiction.

$w$ must be contained in a polyhedron $Q^s_1$ belonging to $D$ and containing $Q^r$,
since otherwise it could not be a nearest point to $z'$ in $W$. Let $Q^t_2$ be the polyhedron
in $D$ of which $w$ is an inner point; this is contained in $Q^s_1$, and cannot
contain $Q^r$; it is therefore, by Lemma 7, contained in one of the boundary
polyhedra $Q'^u$ of $Q^s_1$. As there are only a finite number of possibilities for
$Q^s_1$, $Q^t_2$, $Q'^u$, we may, by replacing the sequence of points $z'$ by a suitable subsequence,
assume that these are the same for all $w$. We now identify a neighborhood
of $z$ in $Q^s_1$ with a convex angle in a Euclidean space $R^s$; as $z$, $z'$, $w$,
$Q^r$, $Q'^u$ are contained in $Q^s_1$, we may, in the neighborhood of $z$, identify them
with corresponding points and subsets of that convex angle, and $x$ with the
corresponding vector in that same angle.

We have assumed that the direction of the vector $zw$ tends to that of $x$;
therefore $Q^r$ cannot be the same as $Q^s_1$, for $w$ is on the boundary of $Q^s_1$, and $x$
is an inner vector of $Q^r$. Therefore $Q^r$ is contained in a boundary polyhedron
$Q''^v$ of $Q^s_1$; the directions of the vectors $zz'$, $zw$ tend to the direction of $x$;
each point $w$ is in $Q'^u$, each point $z'$ in $Q''^v$, and, in the neighborhood of $z$, $Q'^u$
and $Q''^v$ are the same as two boundary angles of the convex angle $Q^s_1$; therefore
$x$ must be in the angle at $z$ of $Q'^u \cap Q''^v$, which, by Lemma 7 (applied to
$Q^s_1$), is the union of polyhedra of $D$, so that $x$ is in the angle at $z$ of one of the
latter polyhedra. Therefore (applying the induction assumption to $Q''^v$) $Q^r$ is
contained in that polyhedron, and a fortiori in $Q'^u$. Hence, applying the induction
assumption to $Q'^u$, we get $Q^r \subset Q^t_2$, which contradicts an earlier statement.

We now define a cellular subdivision of a polyhedron $P^n$ as a subdivision $D$,
every polyhedron $Z^r_\rho$ in which is a differentiable cell (of the same class as $P^n$).
The application of the results of §6 to arbitrary polyhedra depends upon the
following lemma:


Lemma 9. Every differentiable polyhedron admits a cellular subdivision.


This is essentially contained in the work of S. S. Cairns on triangulation, 




and also in a subsequent paper of H. Freudenthal on the same subject("), 
and need not be proved here. 

On a differentiable polyhedron, it is possible to define differentials and
differential forms in the usual manner. Such a polyhedron will be called a
Riemannian polyhedron if there has been given on it a positive-definite quadratic
differential form, locally defined everywhere, in terms of local coordinates
$z^\mu$, as $\sum_{\mu,\nu} g_{\mu\nu}\, dz^\mu dz^\nu$. We make once for all the assumption that our
Riemannian polyhedra are of class at least $C^3$, and that the $g_{\mu\nu}$, which locally
define their Riemannian structure, are of class at least $C^2$ wherever defined. If
$P^n$ is such a polyhedron, and $Q^p$ any regular subpolyhedron of $P^n$, the Riemannian
structure of $P^n$ induces again such a structure on $Q^p$; if $\zeta^i$ are local
coordinates at a point $\zeta$ in $Q^p$, and the functions $z^\mu(\zeta)$ define the local imbedding
of $Q^p$ in $P^n$ at that point, the structure of $Q^p$ at that point is defined
by the form $\sum_{i,j} \gamma_{ij}\, d\zeta^i d\zeta^j$, where $\gamma_{ij} = \sum_{\mu,\nu} g_{\mu\nu}\cdot \partial z^\mu/\partial\zeta^i \cdot \partial z^\nu/\partial\zeta^j$. We shall denote
by $dv(z)$ the intrinsic volume-element in $P^n$ at $z$, and by $dv(\zeta)$ the same in $Q^p$
at $\zeta$.

On a Riemannian polyhedron $P^n$, satisfying the above assumptions, we
can define locally at every point $z$ the Riemannian curvature tensor, and
hence, by formula (1) of §1, the invariant $\Psi(z)$. Let now $Q^p$ be a regular subpolyhedron
of $P^n$, and $\zeta$ a point of $Q^p$; we shall denote by $N^{n-p}(\zeta)$ the normal
linear manifold to $Q^p$ at $\zeta$, which is a submanifold of the tangent space to $P^n$
at $\zeta$. We denote by $\Gamma(Q^p, \zeta)$ the trace, on the unit-sphere, of the dual angle,
taken in the tangent space to $P^n$, of the angle of $Q^p$ at $\zeta$. Furthermore, $x$ being
any vector in $N^{n-p}(\zeta)$, we define $\Psi(\zeta, x \,|\, Q^p)$ by formulae (2), (3), (4), (5)
of §1.

Let now $R^s$ be a polyhedron in a subdivision of $Q^p$. If $s = p = n$, we define
$I(Q^p, R^s)$ as the integral of $\Psi(z)\cdot dv(z)$ over $R^s$. If $s < n$, we define $I(Q^p, R^s)$
as the integral of $\Psi(\zeta, \xi \,|\, R^s)\,dv(\zeta)$ when $\zeta$ describes the set of inner points of $R^s$
and $\xi$ describes, for each $\zeta$, the spherical cell $\Gamma(Q^p, \zeta)$. This implies that
$I(Q^p, R^s) = 0$ if the inner points of $R^s$ are of type greater than $s$ in $Q^p$, because
$\Gamma(Q^p, \zeta)$ has then a dimension less than $n-s-1$. If, therefore, we consider
the sum $\sum_{s,\sigma} I(Q^p, R^s_\sigma)$, taken over all polyhedra $R^s_\sigma$ of a subdivision of $Q^p$,
this sum has the same value as the similar sum taken for the canonical subdivision
of $Q^p$; the value of that sum is therefore independent of the subdivision
by means of which it is defined, and we may write:

$$\sigma(Q^p) = \sum_{s,\sigma} I(Q^p, R^s_\sigma),$$

the sum being taken over all polyhedra of any subdivision of $Q^p$.


(#4) See S. S. Cairns’ expository paper, Triangulated manifolds and differentiable manifolds, 
in Lectures in topology, University of Michigan Conference of 1940, University of Michigan 
Press, 1941, p. 143, where references will be found to Cairns’, Freudenthal’s and Whitehead’s 
publications. 




$\Psi(z)$ has been defined by using $P^n$ as the underlying Riemannian space.
If, on the other hand, we use $Q^p$ as underlying space, we may, substituting $p$
for $n$ in formula (1) of §1 and using the metric and curvature tensors of $Q^p$,
define the similar invariant for $Q^p$, which we denote by $\Psi_0(\zeta)$. Similarly, $\xi_0$ being
a normal unit-vector to $R^s$ in the tangent space to $Q^p$ at a point $\zeta$ of $R^s$,
we define $\Psi_0(\zeta, \xi_0 \,|\, R^s)$ by the formulae, similar to (2)-(5) of §1, where $Q^p$ is
taken as underlying space instead of $P^n$. We also define $\Gamma_0(Q^p, \zeta)$ as the trace,
on the unit-sphere, of the dual angle, taken in the tangent space to $Q^p$, of
the angle of $Q^p$ at $\zeta$. And we define $I_0(Q^p, R^s)$ as the integral of $\Psi_0(\zeta)\,dv(\zeta)$
over $R^s$ if $s = p$, and, if $s < p$, as the integral of $\Psi_0(\zeta, \xi_0 \,|\, R^s)\,dv(\zeta)$ when $\zeta$ describes
the set of inner points of $R^s$ and $\xi_0$ describes, for each $\zeta$, the spherical
cell $\Gamma_0(Q^p, \zeta)$. By the same argument as above, we see that the sum

$$\sigma_0(Q^p) = \sum_{s,\sigma} I_0(Q^p, R^s_\sigma),$$

taken over all polyhedra $R^s_\sigma$ of a subdivision of $Q^p$, is independent of that
subdivision. This sum, taken for the canonical subdivision of $Q^p$, is the same
(except for slight changes of notations) as the sum that occurs in the right-hand
side of the formula in Theorem II of §1, when that theorem is applied to
$Q^p$. With our present notations, we may, therefore, re-state our Theorem II
in the following terms:

THEOREM II. For every Riemannian polyhedron $Q^p$, $\sigma_0(Q^p) = (-1)^p\cdot\chi'(Q^p)$.


We shall first prove that $\sigma(Q^p) = \sigma_0(Q^p)$. As $\sigma(Q^p)$, $\sigma_0(Q^p)$ can be defined
from the canonical subdivision of $Q^p$, it will be enough to prove that, for every
$Q^r_\lambda$ in that subdivision (that is, either $Q^p$ or one of its boundary polyhedra),
$I(Q^p, Q^r_\lambda) = I_0(Q^p, Q^r_\lambda)$; and this will be proved if we prove that


(6′) $\Psi_0(\zeta) = \int_{\Gamma(Q^p,\zeta)} \Psi(\zeta, \xi \,|\, Q^p)\, d\xi$


whenever $\zeta$ is an inner point of $Q^p$, and


(7′) $\int_{\Gamma_0(Q^p,\zeta)} \Psi_0(\zeta, \xi_0 \,|\, Q^r)\, d\xi_0 = \int_{\Gamma(Q^p,\zeta)} \Psi(\zeta, \xi \,|\, Q^r)\, d\xi$


whenever $\zeta$ is an inner point of a boundary polyhedron $Q^r$ of $Q^p$. But these
identities have been proved, as formulae (6) and (7) of §6, in the particular
case when $P^n$ is a Riemannian cell; they are purely local, and depend only
upon the angle of $Q^p$ at $\zeta$, the $g_{\mu\nu}$ and their first and second derivatives and
the first and second derivatives of the $z^\mu(\zeta)$ at that point; hence they hold in
general.

We now prove the important additivity property of the function $\sigma(Q^p) = \sigma_0(Q^p)$:


Lemma 10. For any subdivision of $P^n$, the formula holds:

$$(-1)^n \sigma(P^n) = {\sum_{r,\rho}}' \, (-1)^r \sigma(Q^r_\rho),$$

where $\sum'$ denotes summation over all inner polyhedra $Q^r_\rho$ of the subdivision.


Call $S$ the sum on the right-hand side. Replacing the $\sigma(Q^r_\rho)$ by their definition,
we see that

$$S = \sum \epsilon_{r,\rho;s,\sigma}\cdot I(Q^r_\rho, Q^s_\sigma),$$

where the sum is taken over all values of $r$, $\rho$, $s$, $\sigma$, and $\epsilon_{r,\rho;s,\sigma}$ has the value
$(-1)^r$ whenever $Q^r_\rho \supset Q^s_\sigma$ and $Q^r_\rho$ is an inner polyhedron of the subdivision, and
the value 0 otherwise. We may write, therefore:

$$S = \sum_{s,\sigma} J_{s,\sigma},$$

where $J_{s,\sigma}$ is defined by

$$J_{s,\sigma} = \sum_{r,\rho} \epsilon_{r,\rho;s,\sigma}\cdot I(Q^r_\rho, Q^s_\sigma);$$

the latter sum may be restricted to those values of $r$, $\rho$ for which $Q^r_\rho$ contains
$Q^s_\sigma$ and is an inner polyhedron.

We first calculate $J_{s,\sigma}$ in the case $s = n$; the sum then contains only one
term, and we have:

$$J_{n,\sigma} = (-1)^n I(Q^n_\sigma, Q^n_\sigma) = (-1)^n I(P^n, Q^n_\sigma).$$

We now take the case $s < n$. From the definition of $I(Q^r_\rho, Q^s_\sigma)$, it follows that
$J_{s,\sigma}$ is the integral of $\Psi(\zeta, \xi \,|\, Q^s_\sigma)\,dv(\zeta)$ when $\zeta$ describes the set of inner points
of $Q^s_\sigma$, and the integration in $\xi$, for each $\zeta$, is over the chain:

$$\Delta = {\sum_{r,\rho}}' \, (-1)^r\, \Gamma(Q^r_\rho, \zeta).$$

Now $\Gamma(Q^r_\rho, \zeta)$, as a chain on $S^{n-s-1}$, is the same as the outer angle, taken in
$N^{n-s}(\zeta)$ according to our definitions in §2, of the trace on $N^{n-s}(\zeta)$ of the angle
of $Q^r_\rho$ at $\zeta$. In the sum for $\Delta$, we have all those $Q^r_\rho$ which are inner polyhedra
of the subdivision and which contain $Q^s_\sigma$, that is, which contain $\zeta$ (since $\zeta$
is an inner point of $Q^s_\sigma$); by Lemma 8, their angles at $\zeta$ are the inner angles of
a subdivision of the angle of $P^n$ at $\zeta$; since all those angles contain the tangent
manifold to $Q^s_\sigma$ at $\zeta$, their traces on $N^{n-s}(\zeta)$ bear the same relationship to the
trace on $N^{n-s}(\zeta)$ of the angle of $P^n$; we may therefore apply to the outer angles
of those traces Theorem III of §2, which gives here $\Delta = (-1)^n\, \Gamma(P^n, \zeta)$, and
therefore:

$$J_{s,\sigma} = (-1)^n I(P^n, Q^s_\sigma),$$

which proves the lemma.




Lemma 10 shows that if Theorem II holds for every cell in a certain cellular
subdivision of $P^n$, it holds for $P^n$; for, if all $Q^r_\rho$ are cells and Theorem II
holds for them, we have $\sigma(Q^r_\rho) = (-1)^r \chi'(Q^r_\rho) = 1$, and the right-hand side of
the formula in Lemma 10 reduces to the inner characteristic of $P^n$, as calculated
from the given subdivision. Since every polyhedron admits a cellular
subdivision, it will now be enough to prove Theorem II for cells. By §6, we
know it to hold for an “imbeddable” cell $K^n$, that is, for one in which the $g_{\mu\nu}$
are defined, as in §6, by a mapping $f$ of $K^n$ into a Euclidean space $R^N$.
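The inner characteristic to which the right-hand side of Lemma 10 reduces is the alternating count $\sum' (-1)^r$ over the inner polyhedra of a cellular subdivision. A minimal sketch for triangulated surfaces follows; the function name, the encoding of triangles as vertex triples, and the assumption that each edge lies in at most two triangles are choices made for this illustration only.

```python
from collections import Counter
from itertools import combinations

def inner_characteristic(triangles):
    """Sum of (-1)^r over the inner cells of a triangulated 2-dimensional polyhedron.

    `triangles` is a list of 3-tuples of vertex labels. An edge is inner when
    it is shared by two triangles; a vertex is inner when it lies on no
    boundary edge; every triangle is inner.
    """
    edge_count = Counter()
    for tri in triangles:
        for edge in combinations(sorted(tri), 2):
            edge_count[edge] += 1
    boundary_edges = [e for e, c in edge_count.items() if c == 1]
    inner_edges = [e for e, c in edge_count.items() if c == 2]
    boundary_vertices = {v for e in boundary_edges for v in e}
    all_vertices = {v for tri in triangles for v in tri}
    inner_vertices = all_vertices - boundary_vertices
    # r = 0 for vertices, r = 1 for edges, r = 2 for triangles
    return len(inner_vertices) - len(inner_edges) + len(triangles)
```

For any triangulation of a 2-dimensional cell the count is $(-1)^2 = 1$, in agreement with $\chi'(K^n) = (-1)^n$; for example a hexagon cut into six triangles around a central vertex has one inner vertex, six inner edges and six triangles.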

We next take the case of an analytic cell, which we may define by taking
a convex cell $K^n$, and $n(n+1)/2$ functions $g_{\mu\nu}(z)$, analytic over $K^n$, such that
the quadratic form with the coefficients $g_{\mu\nu}(z)$ is positive-definite for every $z$
in $K^n$. By Cartan’s theorem('*), every point of $K^n$ has a neighborhood which
can be analytically and isometrically imbedded in a Euclidean space. If,
therefore, we subdivide $K^n$ into sufficiently small convex cells (for example,
by parallel planes), the Riemannian structure induced on any one of the latter
by the given structure in $K^n$ can be defined by an analytic mapping into some
Euclidean space, and therefore the results of §6 apply to all cells in that subdivision.
Therefore Theorem II holds for $K^n$.

We now take an arbitrary cell, defined as above by a convex cell $K^n$ and
functions $g_{\mu\nu}(z)$ over $K^n$, the latter being only assumed to be of class $C^2$; by a
theorem of H. Whitney('*), the $g_{\mu\nu}(z)$ can be uniformly approximated, together
with their first and second derivatives, by analytic functions and their
derivatives. But the expression $\sigma(K^n)$, considered (for a given $K^n$) in its dependence
upon the $g_{\mu\nu}$, depends continuously upon the $g_{\mu\nu}$ and their first and
second derivatives; for the integrands $\Psi$ are rational expressions in the $g_{\mu\nu}$,
their first and second derivatives, and the components $\xi_\mu$ of vector $\xi$; the
denominators in the $\Psi$ consist merely of the determinants $g$, $\gamma$, which are
bounded away from 0; $dv(z)$ is $g^{1/2}\cdot dz$, $dv(\zeta)$ is $\gamma^{1/2}\cdot d\zeta$. As to $\xi$, we may put
$\xi_\mu = w\cdot\bar\xi_\mu$, where $\bar\xi_\mu$ describes the trace of the dual of the angle of $K^n$ at $\zeta$ on the
surface $\sum_\mu (\bar\xi_\mu)^2 = 1$, which is independent of the $g_{\mu\nu}$, and $w = (\sum_{\mu,\nu} g^{\mu\nu}\, \bar\xi_\mu \bar\xi_\nu)^{-1/2}$.
Expressing $\xi_\mu$, $d\xi$ in terms of the $\bar\xi_\mu$, we get expressions which are continuous
in the $g_{\mu\nu}$. Since $\sigma(K^n)$ is equal to 1 whenever the $g_{\mu\nu}$ are analytic, it follows
that it is always 1, and this completes our proof.

Our main result is thus proved in full. Owing, however, to the very un- 
satisfactory condition in which the theory of differentiable polyhedra has 
remained until now, the scope of our Theorem II may not be quite adequate 
for some applications, and we shall add a few remarks which properly belong 


(8) Loc. cit. Footnote 4. 

(4) H. Whitney, Analytic extensions of differentiable functions defined in closed sets, Trans. 
Amer. Math. Soc. vol. 36 (1934) p. 63 (see Lemma 2, p. 69 and Lemma 5, p. 74; as to the latter 
lemma, which is due to L. Tonelli, cf. C. de la Vallée Poussin, Cours d’analyse infinitésimale,
vol. 2, 2d edition, Louvain-Paris, 1912, pp. 133-135). 




to that theory (to which part of this section may also be regarded as a con- 
tribution). 

One would feel tempted to regard as a differentiable polyhedron any com-
pact subset of a differentiable manifold which can be defined by a finite num-
ber of inequalities $\varphi_\nu(z) \ge a_\nu$, where the $\varphi_\nu$ are functions of the same class $C^m$
as the manifold; and one would wish to be able to apply the Gauss-Bonnet
formula to such sets.

Now, a compact set $P$ determined, on a manifold $M^n$ of class $C^m$, by in-
equalities $\varphi_\nu(z) \ge a_\nu$, the $\varphi_\nu$ being functions of class $C^m$ in finite number, ac-
tually is a differentiable polyhedron of class $C^m$, according to our definitions,
if the following condition is fulfilled: (A) For any subset $S$ of the set of in-
dices $\nu$, consisting of $s$ elements, and any point $z$ of $M^n$ satisfying $\varphi_\sigma(z) = a_\sigma$
for $\sigma \in S$ and $\varphi_\nu(z) > a_\nu$ for $\nu \notin S$, the matrix $\|\partial\varphi_\sigma/\partial z^\mu\|$ (where $\sigma$ runs over $S$
and $\mu$ ranges from 1 to $n$) is of rank $s$. In fact, if condition (A) is fulfilled,
let $z$ be any point of $P$; call $S$ the set of all those indices $\sigma$ for which $\varphi_\sigma(z) = a_\sigma$;
by condition (A), their number $s$ is at most $n$, and we may take the $\varphi_\sigma(z)$ as $s$
of the local coordinates at $z$; the neighborhood of $z$ in $P$ is then an image of
class $C^m$ of the angle determined in $R^n$ by the $s$ inequalities $x_\sigma \ge 0$.
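The rank requirement in condition (A) can be tested numerically at individual points. The following Python sketch is only an illustration under stated assumptions: the two sample regions (a unit square, and the region between the x-axis and a parabola inside the unit disk), the helper names `rank` and `condition_A_at`, and the tolerances are all hypothetical and are not taken from the paper.

```python
def rank(rows, tol=1e-12):
    """Rank of a small matrix by Gaussian elimination with partial pivoting."""
    m = [list(map(float, r)) for r in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = max(range(rk, len(m)), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) <= tol:
            continue  # no usable pivot in this column
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][col] / m[rk][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
        if rk == len(m):
            break
    return rk


def condition_A_at(point, phis, grads, levels, tol=1e-9):
    """Check the rank condition at one point.

    The active set S collects the indices with phi_i(point) == levels[i];
    the condition asks that the gradients of the active functions have rank |S|.
    """
    active = [i for i, phi in enumerate(phis)
              if abs(phi(*point) - levels[i]) <= tol]
    if not active:
        return True
    return rank([grads[i](*point) for i in active]) == len(active)


# Hypothetical example 1: the unit square 0 <= x <= 1, 0 <= y <= 1.
square_phis = [lambda x, y: x, lambda x, y: 1 - x,
               lambda x, y: y, lambda x, y: 1 - y]
square_grads = [lambda x, y: (1, 0), lambda x, y: (-1, 0),
                lambda x, y: (0, 1), lambda x, y: (0, -1)]
square_points = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0)]
square_ok = all(condition_A_at(p, square_phis, square_grads, [0] * 4)
                for p in square_points)

# Hypothetical example 2: 0 <= y <= x^2 inside the unit disk.
# At the origin two constraints are active and their gradients are dependent.
cusp_phis = [lambda x, y: y, lambda x, y: x * x - y,
             lambda x, y: 1 - x * x - y * y]
cusp_grads = [lambda x, y: (0, 1), lambda x, y: (2 * x, -1),
              lambda x, y: (-2 * x, -2 * y)]
cusp_ok = condition_A_at((0.0, 0.0), cusp_phis, cusp_grads, [0, 0, 0])
```

In this sketch the square passes the check at its corners and on an edge, while the second region fails it at the origin, where the two boundary curves are tangent.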

If condition (A) is not satisfied, P need not be a differentiable polyhedron, 
and indeed it can be shown by examples that “pathological” circumstances 
may occur. It can be shown, however, that condition (A) is fulfilled, in a suit- 
able sense, for “almost all” values of the a,, when the ¢, are given. This gives 
the possibility of extending the validity of Theorem II to cases when (A) is 
not fulfilled, by applying it to suitable neighboring values of the a, and passing 
to the limit. Alternatively, almost any “reasonable” definition of a differen- 
tiable polyhedron, more general than ours, will be found to be such that our 
proofs of Lemmas 7 and 8 will remain valid; all our further deductions will 
then hold provided triangulation is possible. 

Finally, it may also be observed that the set $P$, defined as above by in-
equalities $\varphi_\nu(z) \ge 0$, can be considered as a limiting case of the set $P_\epsilon$ defined
by the inequalities $\varphi_\nu(z) \ge 0$, $\prod_\nu \varphi_\nu(z) \ge \epsilon$, where $\epsilon$ is any number greater than
0. The latter is a polyhedron with a single boundary polyhedron $P_\epsilon^{n-1}$ which
is a compact manifold of dimension $n-1$; it may be considered as derived
from $P$ by “rounding off the edges.” We may therefore apply Theorem II to
$P_\epsilon$; and it is to be expected that the formula thus obtained will tend to a
formula of the desired type when $\epsilon$ tends to 0. In fact, this idea could prob-
ably be used in order to derive our main theorem from the special case of
polyhedra $P^n$ bounded by a single $(n-1)$-dimensional manifold.


HAVERFORD COLLEGE, 
HAVERFORD, Pa. 


LINEAR OPERATORS IN THE THEORY OF PARTIAL 
DIFFERENTIAL EQUATIONS 


BY 
STEFAN BERGMAN 


1. Introduction. The taking of the real part of an analytic function of 
one complex variable is an operation which transforms (in function space) 
the totality of these functions into the totality of harmonic functions of two 
variables. Almost every theorem on analytic functions gives rise to a corre- 
sponding theorem in the theory of the latter functions. The similarity in 
structure suggests the use of an analogous approach in the theory of functions 
satisfying linear partial differential equations of the elliptic type, 


(1.1)  $L(U) \equiv U_{z\bar z} + a(z, \bar z)U_z + b(z, \bar z)U_{\bar z} + c(z, \bar z)U = 0.$


In this connection there arises, first of all, the question of finding all opera-
tors of this kind. All operators which transform the class of functions $f(z)$ into
the class of functions $U(z, \bar z)$, $L(U) = 0$ (functions of both classes considered
in a sufficiently small neighborhood of the origin), can be determined by
formal calculations.

However, the transformation of various results requires that the operation
be applicable “in the large,” that is to say, that every analytic function f regu-
lar in a domain $\mathfrak{F}^2$ of a certain class D may be transformed into a function U
regular in $\mathfrak{F}^2$ and that the inverse operator $P^{-1}$ possess the same property.
Further, for many purposes it is important that for various sequences of
functions $f_n$, the relation $\lim_{n\to\infty} P[f_n(z)] = P[\lim_{n\to\infty} f_n(z)]$ shall hold(*).

In addition to the problem of studying all operators and their classifica- 
tion from this point of view, one may consider a particular operator. To a 


Presented to the Society, February 22, 1941 under the title On a class of linear operators 
applicable to functions of a complex variable and December 29, 1941 under the title On operators 
in the theory of partial differential equations and their applications; received by the
editors July 30, 1941, and, in revised form, April 30, 1942. 

() A system of functions {¢,(s)}, y=1,2,-- possessing the property that every function 
denoted as a basis of the class of analytic function with respect to D. 

An operator with the above properties transforms a basis {g,(s)} into a basis {P[y,(s)]} 
of the class of functions U with respect to D. 
to characterize the operation P. 



certain extent it is useful first to study the latter problem, in order to see 
how the different properties of the operator influence the transformation of 
the results, and in order to get a clearer concept of the laws which govern this
transformation.

In this paper we shall study the second question. We shall return to the 
first problem at another place. 

NOTATION. We denote the cartesian coordinates of the plane by x, y. Often,
however, we shall write $z = x + iy$, $\bar z = x - iy$, instead of x and y. We note that
if we extend the functions considered to complex values of x and y, the vari-
ables $z$ and $\bar z$ are no longer conjugate to each other.

Manifolds will be denoted by German letters, the upper index indicating 
the dimension of the manifold. 

$\mathfrak{F}^2$ will always denote a star-domain of the (x, y)-plane, with center at the
origin. Its boundary will be denoted by $\mathfrak{f}^1$. $\mathfrak{f}^1$ is supposed to be a differentiable
curve.
Further, we denote by $E[\,\cdots]$ the set of points whose coordinates satisfy
the relations indicated in brackets. $\mathfrak{S}$ means the logical sum.
I. THE OPERATOR AND ITS PROPERTIES 
2. The class of functions $\mathcal{C}(E)$. A survey of obtained results. A complex
harmonic function $h(z, \bar z)$ of two real variables x, y can be represented in the
form

(2.1)  $h(z, \bar z) = F(z) + G(\bar z)$

where $F(z)$ and $G(\bar z)$ are analytic functions of one complex variable. Since we
can write $F(z) = \int_{-1}^{1} f[(z/2)(1 - t^2)](1 - t^2)^{-1/2}\,dt$, (2.1) can be written in the form

(2.2)  $h(z, \bar z) = \int_{-1}^{1} \{ f[(z/2)(1 - t^2)] + g[(\bar z/2)(1 - t^2)] \}(1 - t^2)^{-1/2}\,dt$

where f and g are analytic functions of one complex variable which are regu-
lar at the origin.
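The integral with weight $(1 - t^2)^{-1/2}$ that links f to F can be evaluated numerically for a polynomial f. The Python sketch below is an illustration only: the substitution t = sin θ, the helper names, the sample polynomial and the sample point are assumptions made here, not taken from the paper. It compares a quadrature value with the term-by-term closed form, which uses the Beta integral $\int_{-1}^{1}(1-t^2)^{n-1/2}\,dt = \Gamma(n+1/2)\Gamma(1/2)/\Gamma(n+1)$.

```python
import math


def apply_operator(f, z, nodes=200):
    """Approximate the integral of f((z/2)(1-t^2)) (1-t^2)^(-1/2) over -1 < t < 1.

    With t = sin(theta) the weight cancels, leaving the integral of
    f((z/2) cos(theta)^2) over -pi/2 < theta < pi/2 (midpoint rule).
    """
    h = math.pi / nodes
    total = 0j
    for k in range(nodes):
        theta = -math.pi / 2 + (k + 0.5) * h
        total += f(0.5 * z * math.cos(theta) ** 2)
    return total * h


def beta_factor(n):
    """Integral of (1-t^2)^(n-1/2) over -1 < t < 1."""
    return math.gamma(n + 0.5) * math.gamma(0.5) / math.gamma(n + 1)


def closed_form(coeffs, z):
    """Term-by-term image of f(zeta) = sum a_n zeta^n."""
    return sum(a * (z / 2) ** n * beta_factor(n) for n, a in enumerate(coeffs))


coeffs = [1.0, -2.0, 0.5, 3.0]  # hypothetical sample coefficients a_0..a_3


def f(zeta):
    return sum(a * zeta ** n for n, a in enumerate(coeffs))


z = 0.7 + 0.4j  # hypothetical sample point
numeric = apply_operator(f, z)
exact = closed_form(coeffs, z)
```

For a polynomial f the integrand in θ is a trigonometric polynomial, so the midpoint rule reproduces the closed form up to rounding.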

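As was indicated in [3] (numbers in brackets refer to the bibliography),
the representation (2.2) can be generalized. Suppose that a, b, c are analytic
functions of two complex variables $z$, $\bar z$. Then for every equation $L(U) = 0$
[see (1.1)], there exist functions

(2.3)  $E_k(z, \bar z, t) = 1 + t^2 z \bar z E_k^*(z, \bar z, t), \quad k = 1, 2,$

such that every solution of $L(U) = 0$ can be written in the form

(2.4)  $U(z, \bar z) = \exp\left(-\int_0^{\bar z} a\,d\bar z\right) \int_{-1}^{1} E_1(z, \bar z, t)\, f((z/2)(1 - t^2))(1 - t^2)^{-1/2}\,dt$
       $\quad + \exp\left(-\int_0^{z} b\,dz\right) \int_{-1}^{1} E_2(z, \bar z, t)\, g((\bar z/2)(1 - t^2))(1 - t^2)^{-1/2}\,dt.$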




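For many purposes, instead of considering the functions

$U(z, \bar z) = \exp\left(-\int_0^{\bar z} a\,d\bar z\right) \int_{-1}^{1} E(z, \bar z, t)\, f((z/2)(1 - t^2))(1 - t^2)^{-1/2}\,dt$

it is useful to investigate the functions

(2.5)  $u(z, \bar z) = U(z, \bar z) \cdot \left[\exp\left(-\int_0^{\bar z} a\,d\bar z\right)\right]^{-1}.$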

Let $\mathcal{C}(1)$ be the totality of analytic functions of the complex variable z
which are regular at the origin. The totality of functions $u(z, \bar z)$ which can be
represented in the neighborhood of the origin in the form

(2.6)  $u(z, \bar z) = P(f) \equiv \int_{-1}^{1} E(z, \bar z, t)\, f((z/2)(1 - t^2))(1 - t^2)^{-1/2}\,dt, \quad f \in \mathcal{C}(1),$

will be known as the class(2) $\mathcal{C}(E)$. We define $E(z, \bar z, t)$ to be the generating
function of $\mathcal{C}(E)$, f the associate of u, and call the domain in which the repre-
sentation is valid the domain of association.

If $E_1$ satisfies a certain partial differential equation, the functions
$U \in [\exp(-\int_0^{\bar z} a\,d\bar z)]\cdot\mathcal{C}(E_1)$ satisfy the equation $L(U) = 0$. [$f(z, \bar z)\cdot\mathcal{C}(E_1)$ de-
notes here the class of functions $f(z, \bar z)\cdot u$, where $u \in \mathcal{C}(E_1)$.] The totality of
the solutions of $L(U) = 0$ is given by

$\left[\exp\left(-\int_0^{\bar z} a\,d\bar z\right)\right]\cdot\mathcal{C}(E_1) + \left[\exp\left(-\int_0^{z} b\,dz\right)\right]\cdot\mathcal{C}^*(E_2)$

where $\mathcal{C}^*(E)$ is a class analogous to $\mathcal{C}(E)$, the associates of whose functions are
analytic functions of $\bar z$. The present paper is devoted to a general study of
the functions of any class $\mathcal{C}(E)$, that is, a class of analytic functions of two
real variables x, y which can be represented in a sufficiently small neighbor-
hood of the origin by the right-hand member of (2.6)(3).

In this paper we drop the assumption that the functions u€((E) satisfy 


(2) We may also consider classes of functions for which a representation analogous to (2.6)
holds in the neighborhood of a point a, $a \ne 0$. Functions satisfying $L(U) = 0$ possess the property
that the representation (2.4) exists for every point a. The study of the dependence upon the
point a of $E(z, \bar z, t \mid a)$ and the associate $f(z \mid a)$ of a function u is an interesting problem of the
theory.

(3) The relation (2.6) may be interpreted as a mapping (in the function space) of $\mathcal{C}(1)$ into
the class $\mathcal{C}(E)$. We are going to study the duality between the theories of the functions of $\mathcal{C}(1)$
and those of $\mathcal{C}(E)$.

Note that our space of functions includes those which are not defined in one fixed domain,
but only in a sufficiently small neighborhood of the origin.

These functions arise also in other connections, for example, as a set of particular solutions


a linear partial differential equation. We suppose only that the functions u 
possess the two properties A and B which we now describe. 
A. The function E can be written in the form

(2.7)  $E(z, \bar z, t) = 1 + t^2 z \bar z E^*(z, \bar z, t)$

where $E^*$ is an analytic function of two complex variables $z$, $\bar z$ regular in the
region $E[|z| < \infty, |\bar z| < \infty]$ and a continuously differentiable function of
t in $E[|z| < \infty, |\bar z| < \infty, |t| \le 1]$.
We note that from A follow:
$A_1$. Every $u \in \mathcal{C}(E)$, regular in a star-domain $\mathfrak{F}^2$, can be continued
analytically in(*)
(2.8) R(F2) = Elz = a + id, Z = a — ib, (DEB, a, b real]. 
A:. For every u€C(E), 
(2.9) | u(z, max | f(s) |, for (z, 2) RF?) 
ter 
and 
c= max | E(z, 2, #) | ; 
(s, EMH), 


B. There exists, for every §?, an operator G(z, 2, £,&,Xo0, - «+ ,Xmn) such 
that 


(2.10) u(z,Z)= gf, F), 8), (2,2) ER4(F*). 


Here f' is the boundary of §', ds; is the line element of f', and G(s, 2, ¢, &, 
Xoo- Xmn), [| Xpql < ©, (bg) = (00), - - , (mn)] is an analytic function of 
two complex variables z, 2, which is regular in R*(#*). 

Since we suppose that $E(z, \bar z, t)$ is an analytic function of two complex
variables $z$, $\bar z$, the functions $u(z, \bar z)$ are also analytic functions of two complex
variables. In general, they can be continued analytically in the space ranged
over by two complex variables and, therefore, outside of their domains of
association.

In developing the theory of the functions of @(E), one may distinguish 
the following two types of results: 
of partial differential equations of order higher than two, or as solutions of systems of partial 
differential equations. We note that often the pair of solutions of a system of equations may be 
interpreted physically, for example, as the stream and potential function of a flow. 

It should be stressed that our investigations concern the behavior of functions $u(z, \bar z)$ for
real values of x and y (that is, for $z$ and $\bar z$ which are conjugate). However, in some auxiliary
considerations we shall extend x and y to complex values.

(*) To every point with the coordinates x=a, y=b there correspond planes $*(a, 5) 
=E[s=a+#b] and 0%(a, b) =E[#=a—ib] in the four-dimensional space. Thus, §*) is the 
intersection of two four-dimensional cylinders. 




(1) Theorems in which u(z, 2) is considered inside of the domain of asso- 
ciation, 

(2) Theorems concerning the behavior of u on the boundary(®) of 2, as 
well as the properties of u outside of 2. 

Many theorems of the type (1) follow immediately, for the functions u, 
from corresponding results in the theory of analytic functions, by using the 
representation(*) (2.6) and the Corollary 3.1 (p. 136). In particular this is 
true for many theorems stating that an analytic function can be represented 
as the sum of a linear combination of a finite or infinite number of analytic 
functions, belonging to a given set. For instance, this is true for theorems 
dealing with development in series, and on approximation, and the Cauchy 
integral formula, as well as the many consequences of these theorems. (See 
§§5 and 6.) 

In §7 we show that the connection between the position of certain singu-
larities of $u(z, \bar z)$ and the coefficients $B_{mn}$ of the development $\sum B_{mn} x^m y^n$ of u
is, to a certain degree, independent of any special choice of $E^*$ (see (2.7), [2]
and [3]). The same holds for various theorems concerning the connection
between $B_{mn}$ and the regularity domain, the growth of u and averages of u
with certain weight functions. In §9 we study the coincidence, along curves,
of the values of functions belonging to two different classes. These considera-
tions show that many properties of the functions u of the class $\mathcal{C}(E)$ are
either independent of the choice of $E^*$ (see (2.7)) or depend upon $E^*$ in a
simple manner.

In particular, since functions u satisfying (1.1) can be presented in the
form (2.4) with $E_k$ of the form (2.7) (see [3, §1])(7), these results are valid for
the solution of partial differential equations

$L(U) = 0.$

Since $E_k^*$ is the only expression in (2.3) which depends on a, b, c, the resulting
relations are independent of the coefficients a, b, c of the equation.

On the other hand, solutions of certain equations L(U)=0 form also a 
class @(E) wherein E is of a quite different form from that here considered; 


for instance wherein 


(5) The study of singularities of functions of (?(E) is a particular one of this group of ques- 
tions. 

(*) In previous papers [3], [4] we proved that the solutions of an equation L(u) =0 can be 
presented in the form (2.4) with E,, =1, 2 possessing properties A and B. With this result we 
constructed two sets of functions {¢,"(z, 2)} which serve as bases of the class (Ex) with re- 
spect to the star-domains. In [4] we discussed the application of this method to the actual solu- 
tion of boundary value and characteristic value problems. : 

(7) The existence of the operator G(s, 2, ¢, ¢, Xoo, X10, Xo) for these functions follows from 
Green's formula. (See [11, p. 515, (9) ].) 

We note that U differs slightly from u. (See (2.4) and (2.5).) This should be kept in mind
when formulating results for solutions of differential equations.




(2.11) E= [ »| » |} 


(See [3, §3].) 


Since various properties of the class $\mathcal{C}(E)$ depend to a large extent upon
E, the study of $\mathcal{C}(E)$ with various forms of E gives results which are quite
different from the results of this paper(*).

The study of $\mathcal{C}(E)$ with E of the form (2.11) seems to be particularly im-
portant for the study of singularities of functions satisfying $L(u) = 0$.

It is possible to show that to a pole of the associate function f (of u) there
corresponds, in this case, a singularity of u with the following property: u
satisfies two ordinary differential equations in $z$ and $\bar z$, with coefficients
which depend in a simple manner upon $p_\nu$ and $q_\nu$. (See [3, §2] and [2].)

3. Determination of f in terms of $u(z, \bar z)$, $(z, \bar z) \in \mathfrak{R}^4(\mathfrak{F}^2)$. In this section we
shall determine the operator


(3.1) = R(t | 4), 


inverse to (2.6). For the sake of simplicity, we shall deal in the future with a
certain operator Q instead of R. Q is connected with R by the relation(*)


(28)? u) 
T(i/2) ag? 
which may be written in the form of an integral relation: 


2 29 


(3.2) u) = 


(cf. [3, p. 1177]). 
THEOREM 3.1. We have the relation
(3.4)  $Q(z \mid u) = u(z, 0)$.
Proof. It follows from property A (cf. p. 133), that


(*) As shown in [3], if E(s, %, ¢) satisfies a certain partial differential equation, then U, 
UEC(E), satisfies the equation L(U) =0. Clearly this equation may have many solutions, any 
of which can be used as E. We note that the equation (1.2) of [3] can be simplified. Introducing 
instead of equation (1.2) becomes +2p(E,, +DE;+FE)=0. E* is con- 
nected with E by the relation (1.8) of [3]. (See also [3, p. 1177].) Finally, writing 
E*=1+0(s, p) we obtain 

For certain purposes it is also useful to consider an operator of the form P(f) =f(s)E,(s, 2) 
+/*E,(s, , s)f(p(s, s))ds or, others of more complicated structure. 

Note that the above operator transforms log z into a function with a logarithmic singularity.
(*) We define, as usual, 




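(3.5)  $u(z, 0) = \int_{-1}^{1} f((z/2)(1 - t^2))(1 - t^2)^{-1/2}\,dt.$

Suppose now that $f(z) = \sum_{n=0}^{\infty} a_n z^n$. Then

(3.6)  $u(z, 0) = \sum_{n=0}^{\infty} a_n (z/2)^n \int_{-1}^{1} (1 - t^2)^{n-1/2}\,dt = \sum_{n=0}^{\infty} a_n (z/2)^n\, \Gamma(n + 1/2)\Gamma(1/2)/\Gamma(n + 1).$

Since $R(z \mid u) = f(z) = \sum_{n=0}^{\infty} a_n z^n$, the relation (3.4) follows from (3.2) and (3.6).
If $f(z/2)$ is regular in $\mathfrak{F}^2$, then it follows, by (2.6) and A, that $u(z, \bar z)$, too,
is regular in $\mathfrak{F}^2$. Theorem 3.1 yields the inverse statement given by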


COROLLARY 3.1. Suppose that $u(z, \bar z)$, $u \in \mathcal{C}(E)$, is regular in a star-domain
$\mathfrak{F}^2$. Then $f(z/2)$ is regular in $\mathfrak{F}^2$.

This fact is an immediate consequence of (3.4). For the regularity of
$f(z/2) = R(z/2 \mid u)$ in $\mathfrak{F}^2$ follows by (3.2) from the regularity of $Q(z \mid u)$ in the
same domain. The regularity of $Q = u(z, 0)$ in $\mathfrak{F}^2$ follows from B since the
domain $E[z \in \mathfrak{F}^2, \bar z = 0]$ lies in $\mathfrak{R}^4(\mathfrak{F}^2)$.

4. Determination of the associate function in terms of $u(z, \bar z)$ in the real
plane. Relations (3.3) and (3.4) give the representation of the associate
function. But in this formula there appear functions $u(z, \bar z)$ for which the values of
$z$ and $\bar z$ are, in general, not conjugate to each other. This means that we
consider $v(x, y) = u(z, \bar z)$ for complex values of x, y. On the other hand, for many
questions it is important to have a formula where v(x, y) appears and takes
on only real values of the arguments. We obtain such a formula by substituting
the right-hand member of (2.10) for u in (3.4). (See [5, §3].) However, this
last formula is inconvenient because the expression obtained for Q depends
on G, and therefore on $\mathcal{C}(E)$. Because of the importance of formulas for R,
we shall indicate other expressions for Q which are independent of $\mathcal{C}(E)$.


THEOREM 4.1. Suppose that $u(z, \bar z) \in \mathcal{C}(E)$ is regular in $E[x^2 + y^2 < 4R^2]$.
Assume $r < R$. Then


(4.1) or) = ree — 


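Proof. By (2.7) we have

(4.2)  $u(z, \bar z) = Q(z \mid u) + \sum_{m=1}^{\infty}\sum_{n=1}^{\infty} A_{mn} z^m \bar z^n, \qquad Q(z \mid u) = \sum_{m=0}^{\infty} A_{m0} z^m.$

The series (4.2) converges uniformly and absolutely(*) in $E[|z| < R, |\bar z| < R]$.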




Integrating along the circle || =r of the real plane, we obtain 


ra ee @r* 


Now )>7.oA»+¢v¢” is an analytic function of the complex variable ¢, lt | SR. 
We introduce a new variable Z={—r, and develop this function about the 
point Z=0. Then putting Z = —r, we obtain 


Equation (4.1) follows from (4.4). 
REMARK. Analogous considerations yield Q(s| «) in terms of Re(u) and 
Im(u), respectively. In fact, by (4.2) we have 


u(s, 8) + = Q(s| + + 


n=l 
+ > A 
m=1 n=l 
(12%) = (Ae + Aon) + (Aur + Aude? 
0 
+ (An 

(1/22) f (u + = A grt + + 


+ + + 
for g21. Similarly, 


(1/2) (u — = (Aoo — Ao) + (Au — + 


In the same way as before, the associate function 
(4.5) = T(e| u™) + u™) = + 


can be determined (to within A¢o) from either the real or imaginary part of 
=u 

For some purposes, it is convenient to have a formula for Q(s| 1%) in which 
no derivatives of u appear. In order to obtain such an expression, we need 
certain lemmas. 




LEMMA 4.1. Let $\mathfrak{r}^1 = E[0 \le r_1 \le r \le r_2 < R]$. There exists a set of functions
$\varphi_\nu(z)$ such that


= 0, for 


(4 .6a) 


(4.60) J for 


=0 for v¥Au. 


Proof. As is well known, the system $\{(n/\pi)^{1/2} Z^{n-1}\}$ is orthonormal in the
unit circle, | Z| <1. Set h(Z) g(Z) =) The 
Hermitian form 


o 1/2 Pr 
f h(p)g(o)dp = f p™t™—*dp 
(4.7) o « (nm)¥2 


is completely continuous (vollstetig), since }\*_,>>%..93'"~' exists (see 


[9, pp. 147-151, especially p. 151]). Therefore, (4.7) can be written in the 
form 


(4.8) ( > onbn) 


n=l 


where {0,.} is a unitary matrix. Since {o,.} is unitary, the functions 


n=l 
have the property that 


f = 1, f for 
IZ|<1 


= 0, = 0 fory ¥ uw. 


Substituting in (4.9) Z=2/R, px=ri/R, 4, $,(2) =¥,(2/R)/R, we ob- 
tain a system of the desired form. 
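The orthonormality of the system $(n/\pi)^{1/2} Z^{n-1}$, $n = 1, 2, \dots$, over the unit circle $|Z| < 1$, which is the starting point of the proof above, can be checked numerically. The Python sketch below is an illustration under assumptions made here: the polar quadrature (equally spaced angles, composite Simpson rule in the radius), the node counts and the function names are hypothetical choices, not part of the paper.

```python
import cmath
import math


def basis(n, Z):
    """n-th function of the system (n/pi)^(1/2) Z^(n-1), n = 1, 2, ..."""
    return math.sqrt(n / math.pi) * Z ** (n - 1)


def inner_product(m, n, radial=200, angular=64):
    """Integral of basis(m) * conj(basis(n)) over the unit disk |Z| < 1.

    Polar coordinates: equally spaced rule in the angle, composite
    Simpson rule in the radius on [0, 1] (radial must be even).
    """
    h = 1.0 / radial
    total = 0j
    for i in range(radial + 1):
        rho = i * h
        weight = 1 if i in (0, radial) else (4 if i % 2 else 2)
        ring = 0j
        for j in range(angular):
            Z = rho * cmath.exp(2j * math.pi * j / angular)
            ring += basis(m, Z) * basis(n, Z).conjugate()
        ring *= 2 * math.pi / angular
        total += weight * ring * rho  # rho comes from the area element rho d(rho) d(phi)
    return total * h / 3


# Gram matrix of the first four functions; it should be close to the identity.
gram = [[inner_product(m, n) for n in range(1, 5)] for m in range(1, 5)]
```

The diagonal entries reduce to the radial integral of $2n\rho^{2n-1}$ over [0, 1], which equals 1; the off-diagonal entries vanish because of the angular integration.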


LEMMA 4.2. Let g(z) be an analytic function of one complex variable z, regular in
$|z| < R$, which takes on the values F(r) on $\mathfrak{r}^1$, and for which $\iint_{|z|<R} |g(z)|^2\,dx\,dy < \infty$.
Then


(4.10) 




Proof. By theorems on orthogonal functions [7 p. 26], it is seen that g(z) 
can be represented in |s| <R in the form g(z) =)_~ ,A.#,(s), where the series 
converges uniformly in | =p<R. Since > 2.44.(r) = = F(r) for r€r', the 
relation 


Adler = DA, f = 
ri rl ri 


gives us \,A,=/uF(r)6,(r)dr, which yields (4.11). 
REMARK. From (4.11) and (4.9) we have 


(4.12) £(0) = R* (on/r) f 


THEOREM 4.2. Under the hypotheses of Theorem 4.1, we have


2 Ond(r) u(re'®, rete 


Proof. By (4.3) (with k=0) and (4.12), we have 
u(re‘*, 
4.14 Aw = —— 
Since u) =) the relation (4.13) follows. 
THEOREM 4.3. Under the hypotheses of Theorem 4.1, we have 


rr u(re‘?, re-**) do 
(4.15) w) = (w + 1/2) Pa(0) 


where the functions P,, are Legendre polynomials. 


Proof. Integrating (4.2) multiplied by $e^{-ik\varphi}$ along the circle $|z| = r$,
$r_1 \le r \le r_2 < R$, of the real plane, we obtain


(4.16) = f 


Since >>" 5A.+¢r” is a function of bounded variation, it can be developed 
in a uniformly convergent series 


D 
v=0 


(4.17) 


Substituting r=0 we obtain (4.15) from sacaaets 
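The integration along a circle of the real plane, used in the proofs of this section, isolates those coefficients of a double series $\sum\sum A_{mn} z^m \bar z^n$ for which m - n equals a prescribed integer k, since on $|z| = r$ one has $z^m \bar z^n = r^{m+n} e^{i(m-n)\varphi}$. The Python sketch below illustrates this mechanism only; the finite coefficient table, the radius and the function names are hypothetical, and the block does not reproduce the exact formulas (4.3) or (4.16).

```python
import cmath
import math


def circle_mean(u, r, k, nodes=64):
    """(1/(2 pi)) * integral over 0..2 pi of u(r e^{i phi}, r e^{-i phi}) e^{-i k phi} d phi."""
    total = 0j
    for j in range(nodes):
        phi = 2 * math.pi * j / nodes
        z = r * cmath.exp(1j * phi)
        total += u(z, z.conjugate()) * cmath.exp(-1j * k * phi)
    return total / nodes


# Hypothetical finite table of coefficients A[(m, n)], chosen for illustration.
A = {(0, 0): 1.0, (1, 0): 2.0, (3, 0): 1.5,
     (1, 1): 0.25, (2, 1): -0.5, (1, 2): 0.75}


def u(z, zbar):
    return sum(c * z ** m * zbar ** n for (m, n), c in A.items())


def expected(r, k):
    """Sum of A[(m, n)] r^(m+n) over the terms with m - n = k."""
    return sum(c * r ** (m + n) for (m, n), c in A.items() if m - n == k)


r = 0.8
means = {k: circle_mean(u, r, k) for k in range(-1, 4)}
```

For example, with k = 1 the mean collects the terms (1, 0) and (2, 1) of the table, giving $2r - 0.5\,r^3$.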


II. DUALITY BETWEEN THE THEORY OF ANALYTIC FUNCTIONS OF
ONE COMPLEX VARIABLE AND THE THEORY OF FUNCTIONS OF $\mathcal{C}(E)$


5. An integral formula for functions of $\mathcal{C}(E)$. In this section we shall de-
velop an analogue of the Cauchy integral formula.

THEOREM 5.1. For every point Z there exists a function $H(z, \bar z; Z) \in \mathcal{C}(E)$,
regular in the region $E[|z| < \infty] - E[z = Zs, 2 \le s < \infty]$, such that every function
$u(z, \bar z) \in \mathcal{C}(E)$ can be represented by the line-integral

(5.1)  $u(z, \bar z) = (1/2\pi i) \int_{\mathfrak{a}^1} R(Z \mid u)\, H(z, \bar z; Z)\,dZ.$

$\mathfrak{a}^1$ is an arbitrary rectifiable closed curve, lying in the domain of association of u,
and such that the origin lies in its interior.
Proof. By (2.6) and the Cauchy integral formula we have

(5.2)  $u(z, \bar z) = \int_{-1}^{1} E(z, \bar z, t)\, f((z/2)(1 - t^2))(1 - t^2)^{-1/2}\,dt$
       $\quad = (1/2\pi i) \int_{\mathfrak{a}^1} f(Z) \left[ \int_{-1}^{1} \frac{E(z, \bar z, t)\,dt}{(Z - (z/2)(1 - t^2))(1 - t^2)^{1/2}} \right] dZ.$

The expression in the bracket belongs to $\mathcal{C}(E)$; designating it by $H(z, \bar z; Z)$,
we obtain (5.1) by (3.1).
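In the simplest case E = 1 the bracketed kernel depends on z and Z only, and the representation of Theorem 5.1 can be tested numerically. The Python sketch below assumes E = 1, a polynomial associate f, the unit circle as the closed curve, and the substitution t = sin θ; all names, the sample coefficients and the sample point are hypothetical. It compares the contour integral with the term-by-term value $\sum a_n (z/2)^n\,\Gamma(n+1/2)\Gamma(1/2)/\Gamma(n+1)$.

```python
import cmath
import math


def beta_factor(n):
    """Integral of (1-t^2)^(n-1/2) over -1 < t < 1."""
    return math.gamma(n + 0.5) * math.gamma(0.5) / math.gamma(n + 1)


def kernel_H(z, Z, nodes=64):
    """Kernel for E = 1: integral over -1 < t < 1 of
    dt / ((Z - (z/2)(1-t^2)) (1-t^2)^(1/2)), via t = sin(theta), midpoint rule."""
    h = math.pi / nodes
    total = 0j
    for k in range(nodes):
        theta = -math.pi / 2 + (k + 0.5) * h
        total += 1.0 / (Z - 0.5 * z * math.cos(theta) ** 2)
    return total * h


def cauchy_type_integral(f, z, radius=1.0, nodes=64):
    """(1/(2 pi i)) * integral of f(Z) H(z; Z) dZ over the circle |Z| = radius."""
    total = 0j
    for j in range(nodes):
        phi = 2 * math.pi * j / nodes
        Z = radius * cmath.exp(1j * phi)
        dZ = 1j * Z * (2 * math.pi / nodes)
        total += f(Z) * kernel_H(z, Z) * dZ
    return total / (2j * math.pi)


coeffs = [0.5, 1.0, -2.0, 0.75]  # hypothetical associate f(zeta) = sum a_n zeta^n


def f(zeta):
    return sum(a * zeta ** n for n, a in enumerate(coeffs))


z = 0.6 - 0.3j  # sample point with |z|/2 < 1, so the kernel is regular on |Z| = 1
via_contour = cauchy_type_integral(f, z)
direct = sum(a * (z / 2) ** n * beta_factor(n) for n, a in enumerate(coeffs))
```

The radius of the contour must exceed |z|/2 so that the points $(z/2)(1 - t^2)$ stay inside the curve; for a general E the kernel would also depend on $\bar z$, which this sketch does not cover.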

6. Development in series and approximation in $\mathcal{C}(E)$. If $h_n(z)$ converges
to a limit function $h(z)$ for $z \in \mathfrak{F}^2$, $n \to \infty$, then by (2.9),

(6.1)  $\lim_{n\to\infty} [P(h_n(z))] = P(h(z)).$

This fact enables us to prove, in the theory of functions of $\mathcal{C}(E)$, a large
group of theorems dealing with normal families, on development in series,
and on approximation.

Examp Les. I. Suppose a sequence u,(z, 2), m=1, 2, - - - of functions regu- 
lar in is given, with u,€((E,). Let, furthermore, lim,...E,(z, 2, #) = #) 
for (zs, 2)€ —1 Finally, let omit (that is, fail to take on) 
two distinct values. Then u,(z, 2) form a normal family. 

II. As is well known, there exist sets of functions $\{f_\nu(z)\}$ possessing the
property that every function f regular in a domain $\mathfrak{F}^2$ can be therein repre-
sented in the form $f(z) = \sum a_\nu f_\nu(z)$, where the series converges uniformly
in every subdomain $\mathfrak{T}^2$, $\mathfrak{T}^2 \subset \mathfrak{F}^2$. To every such theorem corresponds the
following analogue: For the domain $\mathfrak{F}^2$ there exists a set of functions $u_\nu(z, \bar z)$,
$u_\nu(z, \bar z) = P(f_\nu(z))$, such that every function $u(z, \bar z) \in \mathcal{C}(E)$, regular in




$\mathfrak{F}^2$, can be represented in the form $u(z, \bar z) = \sum_{\nu=1}^{\infty} a_\nu u_\nu(z, \bar z)$. This series con-
verges (uniformly) in every $\mathfrak{T}^2 \subset \mathfrak{F}^2$. In the same way, every theorem stating
that $f(z)$ can be approximated by $\sum_{\nu=1}^{n} a_\nu^{(n)} f_\nu(z)$ in every subdomain $\mathfrak{T}^2$ of $\mathfrak{F}^2$,
has an analogue which can be proved in the theory of $\mathcal{C}(E)$. We note that in
certain cases, it is possible to approximate $u(z, \bar z)$ in $\bar{\mathfrak{F}}^2$ (cf. [4]).

The set $\{z^{\nu-1}\}$ plays an important role among the sets of functions $f_\nu$ men-
tioned above. There arises the problem of characterizing the functions $P(z^{\nu-1})$
independently of their integral representation. This is, in fact, possible if E
satisfies a certain differential equation. For then the $P(z^{\nu-1})$ satisfy an
ordinary differential equation. (We shall consider this question in another
paper.) In particular, the previous results yield: Every function $u(z, \bar z) \in \mathcal{C}(E)$
can be developed in every circle of regularity, $|z| < \rho$, in the form $\sum_{\nu=1}^{\infty} a_\nu P(z^{\nu-1})$
and it can be approximated by $\sum_{\nu=1}^{n} a_\nu^{(n)} P(z^{\nu-1})$ in every regularity domain $\mathfrak{F}^2$.

In addition, our method enables us to prove immediately many other 
theorems concerning the degree of approximation. For instance: Let w=d(sz) 
map conformally the complement of §? into | w| >1, and denote by G} the 
curve d(z)=R>1. If u(z, 2) is analytic in a domain §*DG€}, then there exist 
expressions p,(z, such that 


lim sup [max | u(z, 2) — pa(z, 2) |/"] = 1/R. 


This result is an immediate generalization of the corresponding theorem of 
Walsh [12]. 

7. Coefficient problems. In an analogous way other results (for instance,
those on overconvergence, on existence of boundary values, various gap
theorems, and so on) can be proved in the theory of functions of the class
$\mathcal{C}(E)$. In §6, we introduced the system $P(z^{\nu-1})$, $\nu = 1, 2, \dots$. We indicated that
the series

(7.1)  $\sum_{\nu=1}^{\infty} a_\nu P(z^{\nu-1})$

has a behavior analogous to that of a power series in the case of analytic func-
tions of one complex variable. In particular, one can deduce various proper-
ties of $u(z, \bar z)$ from the behavior of the coefficients $a_\nu$ of its expansion (7.1).
On the other hand, the function $u(z, \bar z)$ can be represented in the neighbor-
hood of the origin in the form

(7.2)  $\sum_{m=0}^{\infty}\sum_{n=0}^{\infty} A_{mn} z^m \bar z^n \quad\text{or}\quad \sum_{m=0}^{\infty}\sum_{n=0}^{\infty} B_{mn} x^m y^n,$

the series converging in $E[|z| < \rho, |\bar z| < \rho]$, $\rho$ sufficiently small. The problem
now arises of finding properties of u from the behavior of $A_{mn}$ or $B_{mn}$.
By (3.4) we have the relation 




(7.3) am = CmA mo, Cm = + + 1/2) 


for the coefficient $a_m$ of the function f, which is the associate of u. Thus, if
some property of $A_{m0}$ is known, the corresponding property of $a_m$ follows by
(7.3). Then, using the theorems of the theory of analytic functions of one
complex variable, which deal with the relation between the function and the
coefficients of its series development, we may obtain results concerning the
relation between the function $u(z, \bar z)$ and the coefficients $A_{m0}$ of its develop-
ment (7.2).
EXAMPLES. I. The radius, r, of the largest circle with center at the origin,
inside which $u(z, \bar z) = \sum\sum A_{mn} z^m \bar z^n$ is regular, is given by

(7.4)  $1/r = \limsup_{n\to\infty} (|A_{n0}|\, c_n)^{1/n}.$


II. Suppose now that $A_{n0} = 0$ in (7.2) for all n, except for $n = \lambda_\nu$, $\nu = 1, 2, \dots$,
where $\lambda_{\nu+1} - \lambda_\nu > \lambda_\nu \theta$, $\theta > 1$. Then $u(z, \bar z)$ cannot be continued analytically to
the outside of the circle whose radius is given by (7.4).

III. A classical result of the theory of entire functions states: Let
$f(z) = \sum_{n=0}^{\infty} a_n z^n$ be an entire function. The logarithm of the greatest of the terms
$|a_n r^n|$ is asymptotically equal to $\log[\max_{0 \le \varphi \le 2\pi} |f(re^{i\varphi})|]$. A similar result is
valid in the case of entire functions $u(z, \bar z) \in \mathcal{C}(E)$. Namely, we have the in-
equality


(7.5) | u(re’*, re-‘#) | < | | max* Emax(?)-(R/R —r), 


where 


| = max | AnoR*|, Emax(r) = max | 2, 


For we have 


1 n=0 


< Emax(r): Anor™| S | AnoR* max: (R/R — 1). 


An inequality for A,» in terms of MaXxosesze| u(re*, re~**)| follows from 
(4.14). 

The relation (7.3) enables us to give interesting formulations of many 
theorems which have analogues in the theory of functions of @(E). For 
instance a generalization of a theorem of Fatou type was given in [5](*). 
Since the coefficients of the associate function can be expressed by Ano in 
the form (7.3), it follows from [5]: If }>R~0| Amo|?< ©, then A 
=P(f), «€C(E), possesses boundary values almost everywhere on the unit 


(*) We note that on p. 668, in 1.14 of [5] it is necessary to add to 1+s8PE* the factor 
exp(—f}Adg). 




circle. Further, the set of points in which these boundary values exist, in- 
cludes the set €, Cbeing the set of points in which —#))(1 —#) 
has boundary values. 
A generalization of Hadamard’s multiplication theorem was given in [2]. 
For certain applications it is useful to obtain an expression for A», in 
terms of Amo. Such a formula follows from the fact that 


u(z, 2) = + 21 — 422) 


is an analytic function of two complex variables. Thus, we have 


1 1 
ond 
(7.6) 0 0 1 


Using the general integral formula (see [6]), we obtain integral formulas 
with various ranges of integration. 


III. SOME PROBLEMS INVOLVING FUNCTIONS OF CLASS $\mathcal{C}(E)$


8. Conjugate functions. Mapping by the functions of $\mathcal{C}(E)$. The functions
u considered in this paper are complex. In most applications (theory of
linear partial differential equations) we need consider only their real part.
We wish to indicate a problem which involves both the real and imaginary
part of u.

The equation (1.1) is equivalent to the system of two equations 


(1/4)au + (1/2)AU.” + (1/2)BUS” + (1/2)cU + 
ap? 0, 
— — (1/2)DU? + (1/2AU + (1/2)BU,” 
(1) (2) 


+ 62U + = 0 


(8.1) 


where 
U=U%+ c=atic, (+5), 
B= (1/2) 
D = (1/2)[(a + 4) — (6+ 5). 


On the other hand, every solution of (1.1) can be written in the form
$\exp(-\int_0^{\bar z} a\,d\bar z)\cdot u(z, \bar z) + \exp(-\int_0^{z} b\,dz)\cdot v(z, \bar z)$, where $v(z, \bar z)$ belongs to a class of
functions, whose associates are anti-analytic functions (that is, analytic
functions of $\bar z$). Thus the functions of $\mathcal{C}(E)$, with an appropriate E, form a
subclass of the functions satisfying the system (8.1). However, if $a = \bar b$, and c
is real, then the equations (8.1) are independent of each other.




Relations existing between the real and imaginary parts of u, in the 
general case, are given in 


THEOREM 8.1. Let u=u%+in®CPE), E=E®+iE®. Further, let 
T(s| u®)+iC(s| u™) be the associate of u (see (4.5)). Then 


1 
ue” = Ey )7(s| ty at 


-1 


1 


-1 


(8.2) 


—1/2 


dt 


1 
-1 


(8.3) 


2.—1/2 


1 


We obtain (8.2) and (8.3) by differentiating u and using the Cauchy-Riemann
differential equations for the associate.
In addition to uw and u® we may consider the pair of functions »™, v®, 


where +iv® = 2, t)f[(2/2)(1—#) ](1—#)-"dt_ and 
- (EP +E”). 
The functions u™ and v™, k=1, 2, are connected by the equations 


(1) (2) (1) (1) (2) 
zs 


(8.4) ty + us 


It follows that many relations exist between u™ and v™. For instance, if 
u™ satisfies a (self-adjoint) partial differential equation of elliptic type and 
second order, say Z,(u“) =Au +4cu™ =0, ¢ real, then a generalized Cauchy 
formula is valid. It determines the values of u™ inside a domain %{? in terms 
of the values of u™ and v™, k=1, 2, on the boundary a’ of %*. Using the 


formula (9), p. 515 of [11] we obtain 
2ru(x, y) = — uOW, + vo 


+ + — W)dn], (x, y) 
Here W(x, 9; &, 7) is a fundamental solution of Z;. 
REMARK. Clearly, if u™ are connected by the generalized Cauchy-Rie- 
mann equations, 


(k) 0, ge 1, 2, 


2 

(e) (se) 
D + + osu + 
kml 


(2) (1) 




a generalized Cauchy formula can be obtained without introducing o™, -»®, 
If L,(u™) =0 then 


+ (— + D.W)u + E,W]dt — [(— (WA), + WC, 
— Wu” + (— (WB), + WD,)u + WE; ]dn}, 
(x, € W*. 
Ax, By, - - - are polynomials in a®, they are the coefficients of the expressions 
up = + + Diu” + Ey 
= Ame + Buy 

In analogy with conformal transformations, one may consider the map-
ping of the (x, y)-plane by the functions $U(z, \bar z)$ of the class $\exp(-\int_0^{\bar z} a\,d\bar z)\cdot\mathcal{C}(E)$.
If U satisfies (1.1) then this mapping represents a transformation by a pair
$(U^{(1)}, U^{(2)})$ of solutions of the system (8.1). The following case is of special
interest. Suppose that the boundary $\mathfrak{f}^1$ of $\mathfrak{F}^2$ can be decomposed: $\mathfrak{f}^1 = \mathfrak{S}_{\nu=1}^{n}\,\mathfrak{f}^1_\nu$.
Suppose further, that by the transformation $U = U(z, \bar z)$ every curve-seg-
ment $\mathfrak{f}^1_\nu$ is transformed into $\mathfrak{g}^1_\nu = E[\psi_\nu(U^{(1)}, U^{(2)}) = 0]$, $\nu = 1, 2, \dots, n$. The
pair $(U^{(1)}, U^{(2)})$ then represents a solution of the system (8.2), (8.3), satisfying
the boundary condition

$\psi_\nu(U^{(1)}, U^{(2)}) = 0$ on $\mathfrak{f}^1_\nu$, $\nu = 1, 2, \dots, n$.


9. The coincidence of functions of different classes along curves. In this 
section we investigate the problem: when can two functions of different 
classes, or at least their real parts, coincide along a curve. Results in this 
direction are especially of interest if one of the two classes is the class of 
analytic functions of one complex variable. 

We shall indicate some applications of the results in this direction to the 
boundary value problem(”), and to the characterization of singularities. 

Let $E(z,\bar z,t)$ possess the property that for $(x, y)$, $z=x+iy$, $\bar z=x-iy$, belonging
to a curve $\mathfrak{f}^1$ of the real plane, we have


(9.1) $E(z,\bar z,t)=E_1(z,t)$, $(x,y)\in\mathfrak{f}^1$,


(12) In analogy with the theory of partial differential equations, we may consider the
boundary value problems for the functions of the class $\mathcal{C}(E)$. Since a function which satisfies
(1.1) can be represented in the form (2.4), the boundary value problem for $L(U)=0$ can be
reduced to that of functions of the class $\mathcal{C}(E)$. We note that if $a=\bar b$ and $c$ is real (see (1.1)), we
may write U(z, 2, 

In this case our later results can be directly used in the theory of partial differential equa- 
tions. 


where $E_1(z,t)$ is an analytic function of one complex variable regular in a (sufficiently
large) domain of the (complex) $z$-plane. Then

(9.2) $u(z,\bar z)=h(z)$ for $(x,y)\in\mathfrak{f}^1$,

where $u(z,\bar z)$ is the function of $\mathcal{C}(E)$ introduced by (2.6) and

(9.3) $h(z)=\int_{-1}^{1}E_1(z,t)\,f\!\left(\tfrac{z}{2}(1-t^2)\right)\frac{dt}{(1-t^2)^{1/2}}$

is an analytic function of one complex variable.
EXAMPLE I. We have


(9.4) $E(z,\bar z,t)=E(z,z+2ic,t)=E_1(z,t)$ for $(x,y)\in\mathfrak{f}^1=E[y=-c]$.
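
The coincidence asserted in (9.4) rests on an elementary identity. As a worked step added for illustration (it uses only $z=x+iy$ and $\bar z=x-iy$), on the line $y=-c$ we have

```latex
z = x - ic, \qquad \bar z = x + ic = (x - ic) + 2ic = z + 2ic ,
```

so that along this line every function of $(z,\bar z,t)$ reduces to a function of $(z,t)$ alone.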


EXAMPLE II. Suppose that 
(9.5) E(z, 2, 2) = E,(r, 2) = 1+ = + y?, 
where $E_1(r,t)$ is a function which is independent of $\varphi$.

(See also (2.3).) Here $r$ and $\varphi$ are polar coordinates. Then we have
(9.6) $E(z,\bar z,t)=E_1(\rho,t)$ for $(x,y)\in\mathfrak{f}^1=E[x^2+y^2=\rho^2]$.


We now shall discuss the above mentioned applications of the coincidence of
the functions $u$ and $h$ on $\mathfrak{f}^1$ (see (9.2)).

1. Boundary value problem. We consider at first the case where $E$ is of the
form described in Example II. Let $u(z,\bar z)\in\mathcal{C}(E)$, where $E$ satisfies (9.6). If


for all integers n, n20, 
1 
J =f E,(p, — #)*-"/*dt 0, 
=3 


then a(r, ¢) =u(re*, re-*) can be represented in the domain E[| s| <p] in 
the form 


#) = f - we 


1 f 2(8)E,(r, 7) H(be-**) drdadd 


(2)? 1 — — (1 — 


where v(8)=&(p, 8) is supposed to be an absolutely integrable function, 
r/p<b<1, and 


(9.8) H(be-**) = >> Ti 


| (9.3) A(z) = Fe 


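We obtain (9.7) by formal calculation, since it follows from Lemma 9.1 that
the first series converges uniformly for $r\le\rho_0<\rho$.


LEMMA 9.1. For every $\varepsilon$, $\varepsilon>0$, and every $\rho$, $\rho\le\rho_1<\infty$, there exists an $n_0$ such
that for $n>n_0$ we have


(9.9) $1-\varepsilon\le\left|\int_{-1}^{1}E_1(\rho,t)(1-t^2)^{n-1/2}\,dt\Big/\frac{\pi^{1/2}\,\Gamma(n+1/2)}{\Gamma(n+1)}\right|\le 1+\varepsilon.$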
Proof. By (9.5) we have 


-1 -1 


(See [10, p. 133, formula (2)].) Since $E_1^*(\rho,t)$ is supposed to be regular, there
exists a constant $c$ such that $|E_1^*(\rho,t)|\le c$, and therefore
! 1 +1 
| PEF(p, (1 sc — 
-1 


£16 1/2) 


2 I'(n + 2) 
Hence from $\Gamma(n+2)=(n+1)\Gamma(n+1)$ we have


j vag, + 1/2) 
s|f #) nat / Ta +) 


1+ 


2(" + 1) 


which yields (9.9). 
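
The estimate in the proof rests on two Beta-type integrals: the normalizing value $\int_{-1}^{1}(1-t^2)^{n-1/2}dt=\pi^{1/2}\Gamma(n+1/2)/\Gamma(n+1)$, and the same integral with an extra factor $t^2$, whose ratio to the first is $1/(2(n+1))$ and so tends to zero. The following numerical check is only a sketch; the function names are hypothetical, and the factor $t^2$ reflects an assumption about the form of (9.5), whose exact statement is not legible here.

```python
import math


def weighted_integral(n, with_t2=False, steps=2000):
    """Midpoint rule for the integral over [-1, 1] of (1 - t^2)^(n - 1/2) dt,
    optionally with an extra factor t^2.

    The substitution t = sin(theta) turns the integrand into cos(theta)^(2n),
    which removes the endpoint singularity that occurs for n = 0.
    """
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = -math.pi / 2 + (k + 0.5) * h
        term = math.cos(theta) ** (2 * n)
        if with_t2:
            term *= math.sin(theta) ** 2
        total += term
    return total * h


def normalizer(n):
    """Closed form sqrt(pi) * Gamma(n + 1/2) / Gamma(n + 1)."""
    return math.sqrt(math.pi) * math.gamma(n + 0.5) / math.gamma(n + 1)


def perturbation_ratio(n):
    """Ratio of the t^2-weighted integral to the normalizer; equals 1/(2(n+1))."""
    return weighted_integral(n, with_t2=True) / normalizer(n)
```

Since `perturbation_ratio(n)` decreases like $1/(2(n+1))$, a bounded factor multiplying $t^2$ changes the normalized integral by an amount that vanishes as $n\to\infty$, which is the mechanism behind (9.9).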

Since in (9.7) we supposed $r\le\rho_0<\rho$, the absolute and uniform convergence
of the series in the first integral of (9.7) follows from (9.9).

An analogous formula can be derived if the derivative 04(r, ¢)/0r is pre- 
scribed along f}. 

REMARK 1. If a (real) function satisfying L(U) =0 and assuming given 
real values on the boundary has to be determined, we take for v(g) such an 
analytic function of one complex variable that Re[exp(— -v(z)] as- 
sumes the given values of f}. 

REMARK 2. In our above considerations, the existence of functions
$u(z,\bar z)\in\mathcal{C}(E)$ satisfying the required boundary conditions was presupposed.
However, there exist cases where the proof can be given without a preliminary




existence hypothesis(*). For instance, suppose that a (real) function H(¢) 

is chosen in such a way that H(y) =Re[h(pe)], where h(z) is an analytic 

function of one complex variable such that | |p>dp < p > 1. 
Let >> (a, cos np+5, sin ny) be the Fourier development of H(y). Then 


Dd Cn = On — 


will be an analytic function, regular in ?=E||s| <p] the real part of which 
converges almost everywhere to H(g) when we approach & radially. It fol- 
lows by Lemma 9.1 that 


(9.10) LD n(p) 


converges uniformly for every rSpo<p. Thus (9.10) is a function of @(E) 
which is regular in §*, and it suffices to show that it converges to h(pe*”) as 
r—p. We shall show that 


om 


converges (uniformly in 7) to zero, as mp—> ©. We have 


eine > ( ) eine 
ante 


p 


J a(p) 


m n 1 +1 
f [E,(p, 4) —E,(r, / f Ex(p, 
n=ng Pn -1 

Since E,(p, #) is an analytic function of the real variable p, it satisfies, for all 
r, a Lipschitz condition 


| Ex(o, — Ex(r, )|< Ci] p 
where C;, is a fixed constant. 


Thus (9.11) is smaller than 


m 1 1 r* 
rll (1 — / E,(o, #)(1 — |. | cn | 
nang -1 p 


(13) In this case we thus obtain the proof of the existence of a function $u(z,\bar z)\in\mathcal{C}(E)$ regular
in $E[|z|<\rho]$, the real part of which assumes the prescribed values almost everywhere on the
boundary $E[|z|=\rho]$. (We therefore obtain, in certain instances, an existence proof for solutions
of partial differential equations.)




By a known result ¢n| Se:(0)/(o —r) where lim =0. (See 
[8, pp. 405-408 ].) Thus, by (9.9), (9.11) is less than 


€1(t0) [1 + (10) ] = lim ex(#) = 0, 


This completes the proof. 

In the last paragraph of §8, we considered, a function u(z, 2) =u%-+tu™ 
which maps a domain §*+f! into the domain g', (f'=S?_,f, g! =S?_193). 
The functions u™ and u® are solutions of a system of partial differential 
equations for which the boundary conditions are: y,[u™, u®]=0 on fi, »=1, 

If §*=R2=E[|z| <p] and if y,(u™, u™) are linear functions of 
and uu, the solution of the above boundary value problem for a pair 
of harmonic functions can be written (in special cases)(“) in the form 
u =¢ fe This result can be generalized for the func- 
tions u(z, 2)€ @(E), E satisfying (9.5). In fact, since by (9.2) h(z) and u(z, 2) 
coincide on &}, the determination of u(z, 2) can be reduced to the finding of a 
function f(z) which satisfies the integral equation 


i-#f dt 2 


h(z) being the analytic function satisfying the prescribed boundary condi- 
tions. We can develop f(s) and h(s) in power series in the domain §3. Writing 
and comparing coefficients, we get 


where H is the function introduced in (9.8), and lg | <b<1. Since in the case 
considered we have h(z) =f, | we have 


u(z, 2) f 7 E(z, 2, 2) H(be-**) 


Ul (n — ay] (1 — #)—/*dndadt, s = 


(9.14) 


(14) We note that in the case considered $u^{(1)}$ and $u^{(2)}$ satisfy the Cauchy-Riemann equations
in addition to the potential equation.

Since u™+iu® =exp(/fads)U (see (2.5)) the relations Aju+B,u®+C,=0, »=1, 2, 
+++, ; Ay, B,, C, being constants, become (A,+B,)p, U+(A,—B,)p,U® +C,=0 where p and 
f: are the real and imaginary parts, respectively, of exp(/3ad3). 

We remark that when dealing with differential equations, especially in connection with the
coincidence problem, it is often useful to consider classes $\mathcal{C}(E)$ with a generating function $E$
which does not fulfill the hypothesis A (see p. 133).




In the case I (see p. 146) we can proceed similarly. However, the determination
of $f$ from (9.3) is slightly more complicated. Let $h(z)$ be the
analytic function, regular in $\mathfrak{B}^2=E[y>-c]$, $c>0$, which assumes the given
values on the curve $\mathfrak{f}^1=E[y=-c]$. Since $\mathfrak{B}^2$ is a convex domain containing
the origin, there exists, by Corollary 3.1, an analytic function $f(z)=R(z\,|\,u)$
(see (3.1)) such that


(9.15) $\int_{-1}^{1}E(z,z+2ic,t)\,f\!\left(\tfrac{z}{2}(1-t^2)\right)\frac{dt}{(1-t^2)^{1/2}}=h(z).$


Let $\sum a_nz^n$ and $\sum A_nz^n$ be the function elements of $f$ and $h$, respectively, at
the origin and suppose $E(z,z+2ic,t)=\sum P_n(t)z^n$. Then we have


II 


E,,0 E,-1,1 E,-2,2 


Proof. A formal calculation yields, 
1 
n=O k=O 


By a comparison of coefficients, we have $\sum_{\nu=0}^{n}a_\nu E_{n-\nu,\nu}=A_n$, which yields
(9.16).
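
The comparison of coefficients gives a triangular system, $\sum_{\nu=0}^{n}a_\nu E_{n-\nu,\nu}=A_n$, which can be solved one coefficient at a time provided each $E_{0,n}\neq0$. The following is a minimal sketch of that forward substitution; `solve_coefficients` and the sample table `toy_E` are hypothetical names and values introduced only for illustration and are not taken from the paper.

```python
from fractions import Fraction


def solve_coefficients(E, A):
    """Solve sum_{nu=0}^{n} a[nu] * E(n - nu, nu) = A[n] for a[0], ..., a[len(A)-1].

    E is a callable returning the entry E_{k, nu}; A is the list of known
    right-hand sides. The unknown a[n] enters equation n with factor E(0, n).
    """
    a = []
    for n, A_n in enumerate(A):
        known = sum(a[nu] * E(n - nu, nu) for nu in range(n))
        pivot = E(0, n)
        if pivot == 0:
            raise ZeroDivisionError(f"E(0, {n}) vanishes; the system is singular")
        a.append((A_n - known) / pivot)
    return a


def toy_E(k, nu):
    # Hypothetical entries, chosen only so that the arithmetic stays exact.
    return Fraction(1, (k + 1) * (nu + 1))
```

Exact rational arithmetic is used so that the residual of each equation can be checked without rounding error; with floating-point entries the same loop applies unchanged.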

2. Residue theorems. There exists a simple method for the construction of
functions of the class $\mathcal{C}(E)$ with certain singularities. In fact suppose that
$f(z)$ is a function which is regular at the origin and possesses a singularity
at the point $a$. For example, take $f(z)=(z-a)^{-1}$. The function(15) $u(z,\bar z)$
$=P[(z-a)^{-1}]$ will belong by definition (see §2) to the class $\mathcal{C}(E)$, and will be
defined by the integral representation (2.6) over the domain $\mathfrak{B}^2=E[|z|<\infty]$
$-E[z=2aS,\ 1\le S<\infty]$. As we shall show immediately, $u(z,\bar z)$ is also


(15) Sometimes it is useful, for the construction of functions with singularities, to use
operators slightly different from (2.6), for example, operators of the form $P_1(f)=E_1(z,\bar z)f(z)$
$+\int E_2(z,\bar z,t)f[p(z,t)]\,dt$, or involving double integrals. If $f(z)$ becomes infinite in such a way
that the above integral is a regular function of $z$, $\bar z$, then $P_1(f)$ possesses a singularity of the same
character as $f$. (See also §1.) In particular, this method yields fundamental solutions.

Note that by the method indicated in [3, especially p. 1173] various representations of
this form can be easily obtained, $E_1$ and $E_2$ being solutions of certain integral or differential
equations.


| 
Eo,o 0 0 -++ Ao 
E 0 
1,0 0,1 1 
(9.16) 
i 
5 


regular on $E[z=2aS,\ 1<S<\infty]$. Let $z^{(0)}=Re^{i\varphi}$, $R=2|a|S$, $S>1$, and $\varphi$
$=\arg a$. The function $[z^{(0)}((1-t^2)/2)-a]^{-1}(1-t^2)^{-1/2}$ considered as a function


of the complex variable $t=t_1+it_2$ possesses two poles, namely at the points
$t^{(1)}=(1-S^{-1})^{1/2}$ and $t^{(2)}=-(1-S^{-1})^{1/2}$. We write


= E[-1<4<1,4=0], 


2 
= E[-1<4<1,4=0] — § Ej-e+i™ <4 46,4 =0] 
k=l 


2 
+ § Eft = + eet, 180° < < 360°], 


k=l 


2 
= E[-1<4<1,4=0] — § <t+4+64=0] 
k=l 


2 
+ Eft = + < < 180°], 
kewl 


€ being sufficiently small. 
For every 2) = Re¥, y<arg a, we have 


[(2/2)(1 — #)— a} = [(s/2)(1— #) — 


since Sj can be reduced to S} without cutting across singularities. The second 
integral exists, and represents an analytic function even for 2 =2, since 
the integrand is regular on ©}. It follows that fo [(2/2)(1 —#) —a]-"(1 —#) dt 
is a regular function of z in the point z™. However, in general we shall find 
different values for this function if we approach first from the left and then 
from the right, since ©} cannot be reduced to S} without cutting the poles at 
t®, ¢@), Hence the point 2a will in general be a branch point of P[(s—a)-]. 

Suppose that f(s) possesses a denumerable number of singularities which 
have no accumulation point within a finite distance of the origin. Then an 
analogous consideration shows that P(f) will possess, at corresponding points, 
singularities which are, in general, branch points of P(f). In that case, the 
integral formula (2.6) represents one branch of P(f). 

Now the problem of characterizing these singularities arises(16). If the
functions $u(z,\bar z)$ belonging to a class $\mathcal{C}(E)$ coincide with analytic functions of
one complex variable along certain curves, we may use this fact for one
kind of characterization of singularities.

The procedure which can be applied may be demonstrated in the case 
where E satisfies the relation (9.5). 

(16) If $E(z,\bar z,t)$ satisfies certain differential equations, the function $P[(z-a)^{-1}]$ satisfies
ordinary differential equations, with coefficients which are connected with $E(z,\bar z,t)$ in a simple
way. For details see [2].




Suppose at first that $u(z,\bar z)$ is regular in the circle $\mathfrak{R}^2=E[|z|<\rho]$. Then
by Corollary 3.1, $f(z)$ (and hence also, by (9.3), $h(z)$) is regular in $\mathfrak{R}^2$, and we
have


(9.17) $\int_{\mathfrak{f}^1}u(z,\bar z)\,dz=\int_{\mathfrak{f}^1}h(z)\,dz=0$, $\mathfrak{f}^1=E[|z|=\rho]$.


Thus under the conditions indicated above, the line integral (9.17) taken
along a circle vanishes if $\mathfrak{f}^1$ lies in the regularity domain of $u(z,\bar z)$.

Suppose now that $f(z)$ has a pole, that is, say we have $f(z)=(z-a)^{-1}$ and
$\rho>2|a|$. Since the point $2a$ is a branch point, $\mathfrak{f}^1=E[|z|=\rho]$ will be now an
open curve on the Riemann surface of $u(z,\bar z)$. (It lies in the sheet in which the
representation (2.5) is valid.) Both end points of $\mathfrak{f}^1$ lie on the slit


$E[z=2aS,\ 1\le S<\infty]$


(but of course in different sheets of the Riemann surface).
We have, then,


tom ¢) 


P[(s — = f Bale, — 


tm t 


(9.18) = — (— —2 | a| = 1, 2. 


Proof. The left-hand member of (9.18) can be written in the form 
1 
(9.19) f Ex(p, [(2/2)(1 — #) — — 


The integrand of (9.19) is an absolutely integrable function. We have there- 
fore 


t=} t<1 


Changing the order of integration, the residue theorem then gives 
f Ei(p, é)dz (1— 


0 for —1 <#<2, 
0 for <i#<1, 


for <t< 


which yields (9.18). 

The analogous formula for P[(s—a)], n>1, an integer, can be obtained 
in the following way: The integral, in a certain neighborhood of every point 
a=a® for which | | *p is a regular function of a. 

Differentiating (9.18) m times with respect to a we obtain 




{= 
(9.20) 1)*m! P[(s — a)- 


(Note that a appears only in ¢ and ?®.) 

10. A connection with a class of difference equations. There exists an important
connection between differential and difference equations. In particular,
some of our previous results can be used for the theory of difference
equations of the type(17)


(10.1)
$$\begin{aligned}
&(M+1)(N+1)\psi(M+1,N+1)+\sum_{S=0}^{S_1}\sum_{K=0}^{K_1}\alpha_{SK}(M-S+1)\psi(M-S+1,N-K)\\
&\quad+\sum_{S=0}^{S_2}\sum_{K=0}^{K_2}\beta_{SK}(N-K+1)\psi(M-S,N-K+1)
+\sum_{S=0}^{S_3}\sum_{K=0}^{K_3}\gamma_{SK}\psi(M-S,N-K)=0.
\end{aligned}$$


Here $\alpha_{SK}$, $\beta_{SK}$, and $\gamma_{SK}$ are constants.
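THEOREM 10.1. Let

(10.2) $U(z,\bar z)=\sum_{m=0}^{\infty}\sum_{n=0}^{\infty}\psi(m,n)z^m\bar z^n$

be a solution of (1.1), where

(10.3) $a=\sum_{s=0}^{S_1}\sum_{k=0}^{K_1}\alpha_{sk}z^s\bar z^k$, $\quad b=\sum_{s=0}^{S_2}\sum_{k=0}^{K_2}\beta_{sk}z^s\bar z^k$ and $c=\sum_{s=0}^{S_3}\sum_{k=0}^{K_3}\gamma_{sk}z^s\bar z^k$.

Then $\psi(M, N)$ is a solution of the difference equation (10.1).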
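Proof. We have
$$\begin{aligned}
U_{z\bar z}&=\sum_{M}\sum_{N}(M+1)(N+1)\psi(M+1,N+1)\,z^M\bar z^N,\\
aU_z&=\sum_{M}\sum_{N}\Bigl[\sum_{S=0}^{S_1}\sum_{K=0}^{K_1}\alpha_{SK}(M-S+1)\psi(M-S+1,N-K)\Bigr]z^M\bar z^N,\\
bU_{\bar z}&=\sum_{M}\sum_{N}\Bigl[\sum_{S=0}^{S_2}\sum_{K=0}^{K_2}\beta_{SK}(N-K+1)\psi(M-S,N-K+1)\Bigr]z^M\bar z^N,\\
cU&=\sum_{M}\sum_{N}\Bigl[\sum_{S=0}^{S_3}\sum_{K=0}^{K_3}\gamma_{SK}\psi(M-S,N-K)\Bigr]z^M\bar z^N,
\end{aligned}$$
where $\psi$ is taken to be zero whenever one of its arguments is negative.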


(17) We note that the difference equations in two variables have been treated very little
by analytical methods. As far as I know the only results in this direction were obtained by C. R.
Adams, On the existence of solutions of a linear partial pure difference equation, Bull. Amer.
Math. Soc. vol. 32 (1926) p. 197.




Since $U$ is supposed to satisfy equation (1.1), the function $\psi(M, N)$ must
satisfy (10.1).
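
Read as a recurrence, (10.1) determines $\psi(M+1,N+1)$ from values with smaller indices, so a solution is fixed by the edge values $\psi(M,0)$ and $\psi(0,N)$. The sketch below assumes that (1.1) has the form $U_{z\bar z}+aU_z+bU_{\bar z}+cU=0$, which is what the shape of (10.1) suggests; the function name, the dictionary encoding of the coefficients, and the truncation parameter `size` are hypothetical choices made for illustration.

```python
from fractions import Fraction


def solve_difference_equation(alpha, beta, gamma, edge_M, edge_N, size):
    """Fill psi[M][N] for 0 <= M, N < size from the recurrence (10.1).

    alpha, beta, gamma map (S, K) to the constant coefficients; edge_M[M] is
    psi(M, 0) and edge_N[N] is psi(0, N), with edge_M[0] == edge_N[0].
    """
    psi = [[Fraction(0)] * size for _ in range(size)]
    for M in range(size):
        psi[M][0] = Fraction(edge_M[M])
    for N in range(size):
        psi[0][N] = Fraction(edge_N[N])

    def at(M, N):
        # Terms with a negative index are absent from the power series.
        return psi[M][N] if M >= 0 and N >= 0 else Fraction(0)

    for M in range(size - 1):
        for N in range(size - 1):
            rest = Fraction(0)
            for (S, K), coeff in alpha.items():
                rest += coeff * (M - S + 1) * at(M - S + 1, N - K)
            for (S, K), coeff in beta.items():
                rest += coeff * (N - K + 1) * at(M - S, N - K + 1)
            for (S, K), coeff in gamma.items():
                rest += coeff * at(M - S, N - K)
            psi[M + 1][N + 1] = -rest / ((M + 1) * (N + 1))
    return psi
```

For instance, with $a=b=0$, $c=1$ and edge values $\psi(0,0)=1$, all others zero, the recurrence reproduces the coefficients $(-1)^k/(k!)^2$ of the series $\sum_k(-1)^k(z\bar z)^k/(k!)^2$, which satisfies $U_{z\bar z}+U=0$.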

REMARK. In speaking of a solution $\psi(M, N)$ of (10.1), we shall in the
future always suppose that the $\psi(M, N)$ have the following property:
There exists a number $\rho>0$ such that $\sum_{M,N}|\psi(M,N)|\rho^{M+N}<\infty$. Under
this hypothesis, it follows that conversely, to every solution $\psi(M, N)$ of
(10.1) there corresponds a function $U(z,\bar z)$ given by (10.2) and satisfying
$L(U)=0$.

The connection indicated in Theorem 10.1 enables one to reduce many
problems of the theory of difference equations of type (10.1) to that of functions
satisfying $L(U)=0$. Then the application of the methods of the theory
of partial differential equations may yield the desired result.

As an example of such a procedure, the following problem may be considered:

To give a representation of the solution $\psi(M, N)$ of (10.1) in terms of
$\psi(N, 0)$ and $\psi(0, N)$, $N=0, 1, 2,\dots$.

By Theorem 10.1 (cf. also the remark following it) this problem is
equivalent to finding the coefficients of $U$ satisfying (1.1), where the functions
$a$, $b$ and $c$ are given by (10.3). On the other hand, by (7.6) we have for the
coefficients $A_{MN}$ the relation


¥(M, N) = 
(10.4) 


UVido 0 


—1 kel 


Here 


H, = exp E f a(rje** + 
0 


(rr 
= exp [- f b(z, — ines], 
0 


= ¥(M, 0) + — 


hi M! 2 


neo 


and the functions E, kR=1, 2, are gen- 
erating functions of the totality of functions satisfying (1.1) when a, b, c are 
given by (10.3). 

The representation (10.4) enables us to draw various conclusions concerning
$\psi(M, N)$. For instance, the growth properties of $\psi(M, N)$ (considered
as a function of $M$ and $N$), in terms of the growth properties of $\psi(N, 0)$ and
$\psi(0, N)$, can be obtained from this relation.

BIBLIOGRAPHY 

1. Stefan Bergman, Anwendung des Schwarzschen Lemmas auf eine Klasse von Funktionen
von zwei komplexen Veränderlichen, Jber. Deutschen Math. Verein. vol. 39 (1930) pp. 163-168.

2. Stefan Bergman, Sur un lien entre la théorie des équations aux dérivées partielles elliptiques et celle
des fonctions d'une variable complexe, C. R. Acad. Sci. Paris vol. 205 (1937) pp. 1198-1200 and
1360-1362.

3. Stefan Bergman, Zur Theorie der Funktionen, die eine lineare partielle Differentialgleichung
befriedigen, Rec. Math. (Mat. Sbornik) N.S. vol. 2 (1937) pp. 1169-1198.

4. Stefan Bergman, The approximation of functions satisfying a linear partial differential equation,
Duke Math. J. vol. 6 (1940) pp. 537-561.

5. Stefan Bergman, Boundary values of functions satisfying a linear partial differential equation of
elliptic type, Proc. Nat. Acad. Sci. U.S.A. vol. 26 (1940) pp. 668-671.

6. Stefan Bergman, Ueber eine Integraldarstellung von Funktionen zweier komplexer Veränderlichen,
Rec. Math. (Mat. Sbornik) N.S. vol. 1 (1936) pp. 851-862.

7. Stefan Bergman, Sur les fonctions orthogonales de plusieurs variables complexes, Interscience
Publishers, New York, 1941.

8. G. H. Hardy and J. E. Littlewood, Some properties of fractional integrals. II, Math. Zeit.
vol. 34 (1932) pp. 403-439.

9. D. Hilbert, Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen, Leipzig
and Berlin, 1912.

10. N. Nielsen, Handbuch der Theorie der Gammafunktionen, Leipzig, 1906.

11. A. Sommerfeld, Randwertaufgaben in der Theorie der partiellen Differentialgleichungen,
Enzyklopädie der mathematischen Wissenschaften, vol. II, A, 7c, pp. 504-570.

12. J. L. Walsh, Interpolation and approximation by rational functions in the complex
domain, Amer. Math. Soc. Colloquium Publications vol. 20, 1935.


BROWN UNIVERSITY,
PROVIDENCE, R.I.




SOME EINSTEIN SPACES WITH CONFORMALLY 
SEPARABLE FUNDAMENTAL TENSORS 


BY 
YUNG-CHOW WONG(1)


1. Introduction. When the fundamental tensor(2) of a Riemannian m-space(3)
$V_m$ is of the form(4)

(1.1) ${}^*g_{\alpha\beta}=\begin{pmatrix}\rho^{-2}g_{ij}(x^k)&0\\ 0&\sigma^{-2}g_{pq}(x^r)\end{pmatrix}$, $\quad\alpha,\beta,\gamma,\delta,\epsilon=1,\dots,m$; $\ i,j,k,l=1,\dots,n$; $\ p,q,r,s=n+1,\dots,m$,

where
$\rho=\rho(x^\alpha)$, $\sigma=\sigma(x^\alpha)$,
it is said to be conformally separable of the type $(n, m-n)$; the tensors
${}^*g_{ij}=\rho^{-2}g_{ij}$ and ${}^*g_{pq}=\sigma^{-2}g_{pq}$, with $x^p$ and $x^i$, respectively, as parameters, are
called its component tensors. We shall say that the tensor (1.1) is properly or
improperly conformally separable according as $\partial_p\rho\neq0$, $\partial_i\sigma\neq0$(5) are satisfied
or not satisfied.
The tensor (1.1) as a generalization of the ordinarily separable tensor
[1, p. 124](6) was recently introduced by Yano [14], where he proved that
in a $V_m$ with fundamental tensor


° 7 
0 


(1.2) 


Presented to the Society, October 31, 1942; received by the editors November 24, 1941.

(1) Most of the results in this paper were obtained while I was a Chinese Ying-Keng
Funds Student visiting Princeton University, for the courtesy of whose authorities, especially
Professor L. P. Eisenhart, I wish to express my most sincere thanks. I wish also to thank Professor
D. J. Struik of Massachusetts Institute of Technology for the conversations we had from
time to time during the preparation of the manuscript.

(2) Fundamental tensors are always supposed to be nonsingular, though not necessarily
definite. All functions appearing in this paper are assumed to have differentiability properties
adequate to the part they play in the discussion.

(3) We denote by $V$, $S$, $E$ a Riemannian space, a space of constant curvature, and an Einstein
space, respectively. The dimensionality is denoted, if necessary, by an index at the lower
right-hand corner.

(4) An index has the same range throughout this paper. An index which appears twice in
an expression is to be summed over the appropriate range. A free index of a tensor equation
assumes each value of its range. A numerical index at the upper right-hand corner of a letter
means an exponential, except in the case of the coordinates $x^\alpha$, $x^i$ or $x^p$.

(5) We use the notation $\partial_\alpha=\partial/\partial x^\alpha$.

(6) Such a reference is made to the literature listed at the end of this paper.




158 Y. C. WONG [March 


$n>1$, the subspaces $x^p=\text{const.}$ are totally umbilical if and only if ${}^*g_{ij}$ is of the
form $[\rho(x^\alpha)]^{-2}g_{ij}(x^k)$. He also proved that if a conformally separable tensor
represents(7) an $S_m$ (that is, an m-space of constant curvature), then each of
its component tensors, if it is of dimension greater than 2, represents $S$'s.

By definition, an Einstein space $E$ is a $V$ whose Ricci and fundamental
tensors differ only by a scalar factor(8). The result mentioned at the end of the
last paragraph no longer holds if $S$ is replaced by $E$, although an $S$ is a special
$E$. In this paper we present a complete study of the conformally separable
tensor which represents an $E_m$ and each of whose component tensors either is
of dimension less than 3 or represents $E$'s. It is found that the construction
of such a conformally separable tensor is invariably reduced to that of the
fundamental tensor $g_{ij}$ of an $E_n$ or a $V_2$ for which the following equation
admits a solution(9) for $y$:


(1.3) $y_{,ij}=-Ig_{ij}$,


where the comma denotes covariant differentiation with respect to $g_{ij}$, and
$I$ is an unspecified scalar. We shall be content with this result, because the
latter problem has already been considered in detail by Brinkmann [2, 3] in
his study of $E$'s which are conformal to each other.

In §2, some results concerning the differential equation (1.3) are given.
In §3, we find the expressions for the Riemann and Ricci tensors of the tensor
(1.1) in terms of those for the same-named tensors of its component tensors.
Concerning a properly conformally separable tensor of the type $(n>1,\ m-n=1)$,
which we consider in §4, we prove (1) that if an $E_m$ admits a one-parameter
family of totally umbilical hypersurfaces, then they are conformal to
one another and each of them has constant scalar curvature (Theorem 4.1); and
(2) that a one-parameter family of conformal $E_n$'s with fundamental tensors
$[\rho(x^k, x^m)]^{-2}g_{ij}(x^k)$ can in general be imbedded isometrically in an $E_{n+1}$ as totally
umbilical hypersurfaces (Theorem 4.2). §§5 and 6 are devoted to the study of
a properly conformally separable tensor ${}^*g_{\alpha\beta}$ of the type $(n>1,\ m-n>1)$
which represents an $E_m$ and each of whose component tensors is either of dimension
2 or represents $E$'s. By means of Theorem 5.1 on a certain system of
differential equations, we show that ${}^*g_{\alpha\beta}$ is conformal to an ordinarily separable
tensor of the type $(n, m-n)$ (Theorem 5.2). This result enables us to prove
that the component tensors of ${}^*g_{\alpha\beta}$ have the property that either each of them
represents $E$'s or $S_2$'s, or $n=m-n=2$ and neither of them represents $S_2$'s
(Theorem 6.2). Characteristic properties of ${}^*g_{\alpha\beta}$ are then derived (Theorems
6.3 and 6.4), showing how the construction of ${}^*g_{\alpha\beta}$ depends on that of the


(7) We sometimes find it convenient to express the fact that $g_{ij}$ is the fundamental tensor
of an $S$ (or $E$) by saying that $g_{ij}$ represents an $S$ (or $E$).

(8) A $V_2$ is always an $E_2$, and an $E_3$ is identical with an $S_3$ [13]. For convenience, we agree
that whenever we speak of an $E$, it is understood that $E$ is of dimension greater than 2.
(9) By "solution" we always mean non-constant solution.


1943] SOME EINSTEIN SPACES 159 


fundamental tensor of an $E_n$ or a $V_2$ which admits a solution of (1.3).

The discussion of improperly conformally separable tensors is much easier
and is carried out in §§7 and 8. In §9, the theorem of Yano concerning an $S_m$
with conformally separable fundamental tensor is extended, and the paper
ends at §10 with some canonical forms for the conformally separable tensors
of the type (2, 2) which represent $E_4$'s.

We conclude this introduction with the following remarks. Since the
component tensor ${}^*g_{ij}$ of the tensor (1.1) can be written as


${}^*g_{ij}=\left[\dfrac{\rho(x^k,x^r)}{\rho(x^k,x_0^r)}\right]^{-2}[\rho(x^k,x_0^r)]^{-2}g_{ij}(x^k)$,


where $x_0^r$ are certain fixed values of $x^r$, there is no loss of generality in assuming
that the function $\rho$ is such that


(1.4) $\rho(x^k,x_0^r)=1$;


in particular, if $\partial_r\rho=0$, we may assume that $\rho=1$. This assumption will be
made whenever it is desirable. A similar remark holds for the function $\sigma$.
Finally, the fundamental tensor of every $V_2$ referred to orthogonal coordinate
curves is conformally separable, and for this reason we shall always suppose
that $m>2$.

I. PRELIMINARIES 


2. The equation $y_{,ij}=-Ig_{ij}$. In what follows we have frequent occasions
to meet the following differential equation in the unknown scalar $y$:


(2.1) $y_{,ij}=-Ig_{ij}$, $\quad h,i,j,k,l=1,\dots,n\ (>1)$,


where $I$ is an unspecified scalar and the comma denotes covariant differentiation
with respect to the fundamental tensor $g_{ij}$. This equation has been considered
by several authors for different purposes (Brinkmann [3, pp. 121-124];
Fialkow [7, pp. 426-427; 8, pp. 471-473]; Yano [16]; Delgleize [4]).
Here we confine ourselves to the case when the Ricci tensor $R_{ij}$ of the $V_n$
with fundamental tensor $g_{ij}$ satisfies


(2.2) $R_{ij}=-(n-1)\,a\,g_{ij}$, $\quad a=$ a scalar,


that is, when $V_n$ is a $V_2(a)$ or an $E_n(a)$(10). In the latter case, (2.2) implies
that $a=\text{const.}$ [5, p. 93, Exercise 5].

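We first find a geometric meaning of (2.1). The Ricci tensors $R_{ij}$ and
$\bar R_{ij}$ of $g_{ij}$ and $\bar g_{ij}=y^{-2}g_{ij}$ are connected by [5, p. 90, (28.6)]


(2.3) $\bar R_{ij}=R_{ij}-(n-2)\dfrac{y_{,ij}}{y}+g_{ij}\,g^{hk}\left[-\dfrac{y_{,hk}}{y}+(n-1)\dfrac{y_{,h}y_{,k}}{y^2}\right]$.


(10) We denote an $E$ or $S$ of scalar curvature $a$ by $E(a)$ or $S(a)$, respectively.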




From this it follows at once that


THEOREM 2.1. Given an $E_n$ $(n>2)$ with fundamental tensor $g_{ij}$, the $V_n$ with
fundamental tensor $y^{-2}g_{ij}$ is also an $E_n$ if and only if $y$ satisfies (2.1).


By definition, a V, is an S, if the Riemann tensor of V, is of the form 


(2.4) Rin = = const. 


A consequence of this is that an S, is necessarily an E,. The Riemann 
tensors Rig and Rig of gi; and 2;;=y~*g.; are connected by [5, p. 90, (28.10) ] 


1 I Al 
(2.5) = Rijn + — — 


th I 
From (2.4), (2.5) and Theorem 2.1 it can easily be proved that 
THEOREM 2.2. Theorem 2.1 remains true when E, is replaced by Sy. 


We now suppose that (2.1) has a solution y. The integrability condition of 
(2.1) is 


(2.6) a = — Vang = — eH Geel 


Transvecting this by the contravariant components g*/ of gi;, we have 


(2.7) Riya = — 11a, 
where Ri =g"'Ryx. When (2.2) is satisfied, (2.7) becomes 
(2 .8) aire = 


from which it follows that
(2.9) $I=I(y)$, $\quad a=a(y)=dI/dy$.
By differentiating $g^{ij}y_{,i}y_{,j}$ and then making use of (2.1) and (2.9), we obtain


$(g^{ij}y_{,i}y_{,j})_{,k}=-2I(y)\,y_{,k}$,


so that


(2.10) $g^{ij}y_{,i}y_{,j}=-2\int I(y)\,dy=-2J(y)$.
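
The step leading to (2.10) can be written out as follows. This is a worked derivation added for illustration, assuming (2.1) in the form $y_{,ij}=-Ig_{ij}$ and using only the covariant constancy of the metric:

```latex
\begin{aligned}
\bigl(g^{ij}y_{,i}y_{,j}\bigr)_{,k}
  &= 2\,g^{ij}\,y_{,ik}\,y_{,j}
     && \text{since } g^{ij}{}_{,k}=0 \\
  &= -2I\,g^{ij}g_{ik}\,y_{,j} = -2I(y)\,y_{,k}
     && \text{by (2.1)} \\
  &= -2\,\partial_k J(y), \qquad J(y)=\int I(y)\,dy
     && \text{by (2.9)} .
\end{aligned}
```

Hence $g^{ij}y_{,i}y_{,j}+2J(y)$ has vanishing gradient, and the constant of integration can be absorbed in $J$.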


If $V_n$ is an $E_n$ or $S_2$, $a=\text{const.}$ For this case, it follows from (2.9) and
(2.10) that


(2.11) $I(y)=ay+f$, $\quad g^{ij}y_{,i}y_{,j}=-(ay^2+2fy+\bar a)$,
where $f$ and $\bar a$ are two constants. Summing up the preceding results we have


THEOREM 2.3. Let a $V_n$ admit a solution $y$ of (2.1). Then equations (2.9) and
(2.10) hold if $V_n$ is a $V_2$; and equations (2.9), (2.10) and (2.11) hold if $V_n$ is
an $E_n$ $(n>2)$ or $S_2$.


To find a meaning of the constant $\bar a$ appearing in (2.11), we use (2.1), (2.2)
and (2.11) in (2.3), and obtain


(2.12) $\bar R_{ij}=-(n-1)\,\bar a\,\bar g_{ij}$.
This shows that $\bar g_{ij}=y^{-2}g_{ij}$ is the fundamental tensor of an $E_n(\bar a)$ or $S_2(\bar a)$.
Hence we have


THEOREM 2.4. If $g_{ij}$ is the fundamental tensor of an $E_n(a)$ or $S_2(a)$ and $y$
is a solution of (2.1), then $y^{-2}g_{ij}$ is the fundamental tensor of an $E_n(\bar a)$ or $S_2(\bar a)$,
where $\bar a$ is determined from (2.11).
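
Theorem 2.4 can be illustrated numerically in the simplest case. The sketch below takes the Euclidean plane ($g_{ij}=\delta_{ij}$, $a=0$) and the hypothetical choice $y=(1+|x|^2)/2$, for which $y^{-2}\delta_{ij}$ is the stereographic metric of the unit sphere. It assumes that (2.11) reads $I=ay+f$, $g^{ij}y_{,i}y_{,j}=-(ay^2+2fy+\bar a)$, that curvature is normalized so that the unit sphere has curvature 1, and it uses the standard formula $K=y\,\Delta y-|\nabla y|^2$ for the Gaussian curvature of $y^{-2}\delta_{ij}$; the function names are illustrative only.

```python
def y(x1, x2):
    # Hypothetical choice: y = (1 + |x|^2) / 2 on the Euclidean plane (a = 0).
    return 0.5 * (1.0 + x1 * x1 + x2 * x2)


def gradient_and_hessian(func, x1, x2, h=1e-3):
    """Central differences; exact up to rounding when func is quadratic."""
    fx = (func(x1 + h, x2) - func(x1 - h, x2)) / (2 * h)
    fy = (func(x1, x2 + h) - func(x1, x2 - h)) / (2 * h)
    fxx = (func(x1 + h, x2) - 2 * func(x1, x2) + func(x1 - h, x2)) / h**2
    fyy = (func(x1, x2 + h) - 2 * func(x1, x2) + func(x1, x2 - h)) / h**2
    fxy = (func(x1 + h, x2 + h) - func(x1 + h, x2 - h)
           - func(x1 - h, x2 + h) + func(x1 - h, x2 - h)) / (4 * h**2)
    return (fx, fy), ((fxx, fxy), (fxy, fyy))


def check_point(x1, x2):
    """Return (y_12, y_11 - y_22, a_bar, curvature of y^-2 delta) at a point."""
    (fx, fy), ((fxx, fxy), (_, fyy)) = gradient_and_hessian(y, x1, x2)
    I = -fxx                                  # (2.1): y_,ij = -I delta_ij
    f = I                                     # (2.11) with a = 0: I = f
    grad_sq = fx * fx + fy * fy
    a_bar = -grad_sq - 2 * f * y(x1, x2)      # solve (2.11) for the constant
    curvature = y(x1, x2) * (fxx + fyy) - grad_sq
    return fxy, fxx - fyy, a_bar, curvature
```

At every point the first two returned values vanish, so the Hessian of $y$ is a multiple of $\delta_{ij}$ as (2.1) requires, and the constant obtained from (2.11) agrees with the curvature of $y^{-2}\delta_{ij}$, both being equal to 1.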

We can also prove that 


THEOREM 2.5. If g:;=~y"2;; is the fundamental tensor of an Sz and y satisfies 
the equation 
1 


where the solidus denotes covariant differentiation with respect to 2;;and I is an 
unspecified scalar, then %;;%s also the fundamental tensor of an Sz. 


Proof. On account of the preceding theorem, we need show only that (2.1) 
is satisfied. Now it can be easily verified that if w; is any vector, then its 
covariant derivatives wi; and w;,; taken with respect to 2;; and gi;=y"Zi;, 
respectively, are related by 


(2.13) = Wig + (WEIGH — 
Therefore we have, from hypothesis and by (2.13), 

1 

y 


1 
y 


y* 


- + 


which shows that y satisfies an equation of the form (2.1), as was to be proved. 

Canonical forms for the fundamental tensor of an $E_n$ which admits a
solution $y$ of equation (2.1) have been given by Brinkmann [3]. We shall not
enter into the detail of his results, but merely mention the main fact that the
construction of such a canonical form depends, according as $g^{ij}y_{,i}y_{,j}\neq0$ or
$=0$, on the fundamental tensor of an arbitrary $E_{n-1}$ or on the fundamental
tensor of an $E_{n-2}$ which contains a parameter and satisfies certain differential
equations.

If a $V_2$ with nonzero scalar curvature $a$, which may or may not be constant,
admits a solution $y$ of (2.1), then it follows from (2.9) and (2.10) that
$g^{ij}y_{,i}y_{,j}\neq0$. Consequently, we can show, by following Brinkmann's method,
that the fundamental form of $V_2$ can be reduced to


(2.14) $-\dfrac{(dx^1)^2}{2J(x^1)}+2eJ(x^1)(dx^2)^2$,

where $e=\pm1$ and $J(x^1)$ is defined by (2.10). Conversely, if the fundamental
form of a $V_2$ is of the form (2.14), where $J(x^1)$ is any function of $x^1$, then $y=x^1$
is a solution of (2.1).
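
The converse statement can be checked directly. The following worked computation is added for illustration; it assumes that (2.14) is read as $ds^2=A\,(dx^1)^2+B\,(dx^2)^2$ with $A=-1/(2J)$, $B=2eJ$, the sign of $A$ being the one forced by (2.10), and it writes $I=dJ/dx^1$:

```latex
\begin{aligned}
&y = x^1:\qquad y_{,ij} = -\Gamma^{1}_{ij}, \qquad
  \Gamma^{1}_{11}=\frac{A'}{2A},\quad \Gamma^{1}_{12}=0,\quad
  \Gamma^{1}_{22}=-\frac{B'}{2A},\\[2pt]
&A'=\frac{J'}{2J^{2}}=\frac{I}{2J^{2}}
  \;\Longrightarrow\; y_{,11}=-\frac{A'}{2A}=\frac{I}{2J}=-I\,A=-I\,g_{11},\\[2pt]
&B'=2eI
  \;\Longrightarrow\; y_{,22}=\frac{B'}{2A}=-2eIJ=-I\,B=-I\,g_{22},\\[2pt]
&g^{ij}y_{,i}y_{,j}=g^{11}=\frac{1}{A}=-2J .
\end{aligned}
```

Thus $y=x^1$ satisfies $y_{,ij}=-Ig_{ij}$ with $I=dJ/dx^1$, in agreement with (2.9) and (2.10).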

Finally, we remark that for any $V_n$ the following particular case of equation
(2.1):


(2.15) $y_{,ij}=0$


admits a solution if and only if $V_n$ has a family of parallel totally geodesic
hypersurfaces, or what amounts to the same thing, a field of parallel vectors.

3. Fundamental relations. From (1.1), it is evident that the components
of the conformally separable tensor ${}^*g_{\alpha\beta}$ and those of its component tensors
${}^*g_{ij}$ and ${}^*g_{pq}$ are connected by


-2 i ; 

2 


where and are Kronecker deltas. Denoting by *I'Zs, ‘Ty, the Christoffel 
symbols of the second kind for the fundamental tensors *gas, *gi;, *gp¢, Te- 
spectively, we have [14, (3.3)] 


k k P k k 
"Ty = Tin "T's =" , = — 


(3.2) 


where 


Pp = log o; = 0; log 


= 


(3.3) 


of = *gtig;, 


The Riemann tensor of *gaa is defined by 




If the values of *I'Z, given by (3.2) are used, then with some calculation we 
find 

= "Rijn — — 

= + ppor) — + 

= 0, 


ski 


1 hl 
= "g + peor) — + pron) |, 


where, as well as in what follows, the semi-colon denotes covariant differen- 
tiation with respect to *g;; or *gp¢, and ‘Rig is the Riemann tensor of *g;;. In 
deriving (3.4) use has been made of the following formulae: 


Tis = Tig — — + pe 15, p. 89 (28.3) ], 


(3.5) = — *g — + *g 0103), 
r 1 
— — PpPy = — o(- 
P/ spa» 

where Ij; denotes the Christoffel symbols of the second kind for g;; .The ex- 
pressions for the remaining components of *Ré,, are obtained from (3.4) by 
interchanging the two sets of indices (i, j, k, 1) and (p, g, r, s). We remark 
that (3.4) can be shown to be identical with the Gauss-Codazzi-Ricci equa- 
tions [5, pp. 162-163, (47.11), (47.12), (47.14)] for the subspaces x” =const. 

in the V,, with fundamental tensor *gas. 
The components of the Ricci tensor *Ras = *Rig, of *gas are readily found 

from (3.4) by contraction; they are 


— *Ri, = — 1)0ipp + (m — n — 1)0,0; + (m — 


1 
= + (m — 


1 
= + n(—) 


where ’R;; (’Ry,) is zero or is the Ricci tensor of *g;; (*g»,) according as 
*g:; (*g5¢) is of dimension 1 or greater than 1. 




II. CONFORMALLY SEPARABLE TENSOR OF THE TYPE $(n>1,\ m-n=1)$(11)
4.1. Scalar curvatures of totally umbilical hypersurfaces in an $E_m$. A
conformally separable tensor of the type $(n>1,\ m-n=1)$ may be taken as
a, B,y,5 =1,---,m(=n-+ 1), 
,»e= +1, 
0 i,j,k, l= 1,--+,n, 


(4.1) = 


for, we may suppose that $\pm g_{mm}$ has been absorbed in $\sigma^{-2}$. From (3.6) it follows
that the condition


(4.2)  ${}^{*}R_{\alpha\beta} = -(m-1)c\,{}^{*}g_{\alpha\beta}$,  c = const.,
for the tensor (4.1) to represent an $E_m(c)$ is


$\partial_i\rho_m + \rho_m\sigma_i = 0,$


= — (m — 1)c*gij, 


1 1 
(m 1)p (-) + “Zam *gtig (-) (m 1)¢*gmms 
P/imm os 
where, we repeat, ${}^{*}g^{mm} = 1/{}^{*}g_{mm} = e\sigma^{2}$, ${}'R_{ij}$ is the Ricci tensor of ${}^{*}g_{ij}$, and the
semi-colon denotes covariant differentiation with respect to ${}^{*}g_{ij}$ or ${}^{*}g_{mm}$.
If we write 


(4.4)  ${}^{*}g^{ij}\,{}'R_{ij} = -(m-1)(m-2)\,{}^{*}a(x^i, x^m),$


then, by definition, ${}^{*}a(x^i, x^m)$ is the scalar curvature of ${}^{*}g_{ij} = \rho^{-2}g_{ij}$. We shall
now prove that ${}^{*}a(x^i, x^m)$ is independent of $x^i$.

Transvecting (4.3)$_2$ and (4.3)$_3$ by ${}^{*}g^{ij}$ and ${}^{*}g^{mm}$ respectively, and taking
account of (4.4), we obtain


os 1 
— (m — 1)(m — 2) *a(x*, + 


- (m — 1)%, 
(m — + (-) = — — te. 


When the latter equation is subtracted from the former, and $e\sigma^{2}$ is used in
place of ${}^{*}g^{mm}$, the result is


(2) In §§4.1 and 4.2 we do not confine ourselves to properly conformally separable tensors,
but a complete discussion of improperly conformally separable tensors is reserved for §§7 and 8.




(4.6)  ${}^{*}a(x^i, x^m) - e(\sigma\rho_m)^2 = c.$

If $\partial_m\rho = 0$, this reduces to

(4.6')  ${}^{*}a(x^i, x^m) = c.$

If $\partial_m\rho \neq 0$, then equation (4.3)$_1$ can be written

$\dfrac{\partial_i\rho_m}{\rho_m} + \sigma_i = 0$, that is, $\partial_i \log(\sigma\rho_m) = 0$,

which gives

(4.7)  $\sigma\rho_m = z(x^m),$

where z is a function of $x^m$ alone. Therefore equation (4.6) becomes

(4.8)  ${}^{*}a(x^i, x^m) - e[z(x^m)]^2 = c.$


This equation and (4.6') show that ${}^{*}a(x^i, x^m)$ does not depend on $x^i$, as was
to be proved. Hence


THEOREM 4.1. If a conformally separable tensor of the type (n>1, m-n=1)
represents an $E_m$, then its first component tensor represents $V_n$'s of constant
scalar curvatures.


And geometrically(”), 


THEOREM 4.1'. If in an $E_m$ there exists a one-parameter family of totally
umbilical hypersurfaces, then these hypersurfaces are conformal to one another
and each of them has constant scalar curvature. If, in particular, the family
consists of totally geodesic hypersurfaces, then they are isometric to one another
and their constant scalar curvatures are all equal to the scalar curvature of $E_m$.


The latter part of this theorem follows from (4.6’). 
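
The integration step behind (4.7) and (4.8) can be written out in one chain. The following is only a sketch; it assumes the abbreviations $\rho_m = \partial_m\log\rho$, $\sigma_i = \partial_i\log\sigma$ of (3.3), and that $\rho_m \neq 0$.

```latex
% Sketch only. Assumes \rho_m = \partial_m \log\rho and \sigma_i = \partial_i \log\sigma
% as in (3.3), and \rho_m \neq 0 so that (4.3)_1 may be divided by \rho_m.
\begin{align*}
\frac{\partial_i\rho_m}{\rho_m} + \sigma_i
  &= \partial_i \log|\rho_m| + \partial_i \log\sigma
   = \partial_i \log|\sigma\rho_m| = 0
  \quad (i = 1,\dots,n) \\
&\Longrightarrow\quad \sigma\rho_m = z(x^m), \\
{}^{*}a(x^i, x^m) &= c + e\,(\sigma\rho_m)^2 = c + e\,[z(x^m)]^2 .
\end{align*}
```

The right-hand side of the last line depends on $x^m$ alone, which is the statement of Theorem 4.1.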
4.2. An imbedding theorem. Continuing our discussion, we now write
${}^{*}a(x^m) = {}^{*}a(x^i, x^m)$, and proceed to prove the following


THEOREM 4.2. In order that the conformally separable tensor (4.1) with
$\partial_m\rho \neq 0$ may represent an $E_m(c)$ and its first component tensor ${}^{*}g_{ij} = \rho^{-2}g_{ij}$ $E_n$'s
or $S_2$'s of scalar curvatures ${}^{*}a(x^m)$, it is necessary and sufficient that when


(4.9)  $\rho(x^i, x^m_0) = 1$


is assumed, the equations


(%) A more general result can be obtained by using the Gauss-Codazzi equations of a $V_n$ in
an $E_{n+1}$. Indeed we can prove that if an $E_{n+1}$ has a totally umbilical hypersurface $V_n$, then the
scalar curvature of $V_n$ is constant. But we shall not go farther with this result, because Theorem
4.1' serves only as a preliminary to the imbedding Theorem 4.2'.




(4.10) Rig = — (m — 
(4.11)  $\rho_{,ij} \sim g_{ij},$


(4.12)  $e\sigma^{-2} = \dfrac{(\rho_m)^2}{{}^{*}a(x^m) - c}$


be satisfied. 


Here we write (4.11) to mean the equation $\rho_{,ij} = -w\,g_{ij}$, where w is an unspecified
scalar. This notation will be used whenever desirable; it enables us
to avoid the unnecessary introduction of many symbols to represent scalar
factors of proportionality.

We know that in a $V_m$ the hypersurfaces $x^m$ = const., whose (first) fundamental
tensors are nonsingular, are totally umbilical if and only if the fundamental
tensor of $V_m$ can be reduced to the form (4.1) [5, pp. 144, 182]. We
also know from Theorem 2.1 that when n>2 and (4.9) is satisfied, equations
(4.10), (4.11) are the conditions that


(4.13)  ${}'R_{ij} = -(n-1)\,{}^{*}a(x^m)\,{}^{*}g_{ij},$


that is that ${}^{*}g_{ij} = \rho^{-2}g_{ij}$ represents $E_n$'s. Accordingly Theorem 4.2 may be
stated geometrically as follows.


THEOREM 4.2'. Let c be any constant and let ${}^{*}g_{ij} = [\rho(x^k, x^m)]^{-2}g_{ij}(x^k)$ with
$x^m$ as parameter and $\partial_m\rho \neq 0$ represent $\infty^1$ $V_n$'s whose scalar curvatures are not
all equal to c. If the $V_n$'s are $E_n$'s (n>2), there exists an $E_{n+1}(c)$ in which they
are imbedded isometrically as totally umbilical hypersurfaces. If the $V_n$'s are $S_2$'s,
a necessary and sufficient condition that they may be imbedded isometrically in an
$S_3(c)$ as totally umbilical surfaces is that, when $\rho(x^i, x^3_0) = 1$ is assumed, the
tensor $\rho_{,ij}$ differ from $g_{ij}$ by a scalar factor. The fundamental tensor of the imbedding
$E_{n+1}(c)$ or $S_3(c)$, if it exists, is


${}^{*}g_{\alpha\beta} = \begin{pmatrix} \rho^{-2}g_{ij} & 0 \\ 0 & (\partial_m \log\rho)^2/[{}^{*}a(x^m) - c] \end{pmatrix},$

where ${}^{*}a(x^m)$ denotes the scalar curvatures of the given $E_n$'s or $S_2$'s.


We shall now prove Theorem 4.2. Since by hypothesis $\partial_m\rho \neq 0$, equations
(4.6), (4.7) and (4.8) are consequences of (4.3), as we have seen in §4.1. If
we solve (4.6) for $e\sigma^{-2}$, the result is (4.12). Hence Theorem 4.2 will be proved
if we can show that in consequence of() (4.7), (4.8), (4.9) and (4.13), equations
(4.3) reduce to (4.11).
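
The algebra behind "solving (4.6) for $e\sigma^{-2}$" is short enough to display. This is a sketch under the stated conventions only: $e = \pm 1$, and $\sigma\rho_m \neq 0$ so that ${}^{*}a(x^m) \neq c$.

```latex
% Sketch only. Uses e = \pm 1, hence e^2 = 1 and 1/e = e, and \sigma\rho_m \neq 0.
\begin{align*}
e\,(\sigma\rho_m)^2 &= {}^{*}a(x^m) - c, \\
\sigma^{2} &= \frac{e\,[{}^{*}a(x^m) - c]}{(\rho_m)^2}, \\
e\,\sigma^{-2} &= \frac{e\,(\rho_m)^2}{e\,[{}^{*}a(x^m) - c]}
               = \frac{(\rho_m)^2}{{}^{*}a(x^m) - c}.
\end{align*}
```

The last line is the component ${}^{*}g_{mm}$ appearing in the fundamental tensor of Theorem 4.2'.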
Now equation (4.3)$_1$ is satisfied because of (4.7). When (4.13) is used,
equation (4.3)$_2$ becomes


(4) We observe that equations (4.6)—(4.8) and (4.12) are unaffected by the supposition (4.9). 


- {- (m — 1)¢+ — 1)*a(x™) 


(2), Je 


which is evidently equivalent to (4.5)$_1$ and


1 
(4.14) ~ 


oS 
But equation (4.5)$_1$ is a consequence of (4.7), (4.8) and (4.3)$_3$, as is evident
from the way in which (4.7) and (4.8) were derived. Thus, because of (4.7),
(4.8) and (4.13), equations (4.3) are equivalent to (4.14) and (4.3)$_3$. In what
follows we shall reduce the latter two equations successively to (4.16), then
to (4.20), and finally to (4.11).

Comparing (4.14) and (4.13) with (2.1) and (2.2), it follows from (2.11)
that equation (4.14) can be written


(4.14’) (-) - [toce=) — + 


where $w(x^m)$ is a function of $x^m$ alone. In virtue of this, equation (4.3)$_3$
becomes


(4.15) + ¢ — *a(x™") — ow(x™) = 0. 


Now 


=(- An log + (1/2)8m log 
p p 


p 


p opm 2(x™) 


by (4.7), where and in what follows, the prime denotes differentiation. Sub- 
stituting the above expression in (4.15) and then using (4.7), (4.8) and the 
equation obtained by differentiating (4.8), we find 


(4.15) 


 g 



Because of this, equation (4.14) becomes 


This is an equation which, because of (4.7) and (4.8), is equivalent to (4.14)
and (4.3)$_3$.

We now express (4.16) directly in terms of $\rho$ and $g_{ij}$. To do this we make
use of (4.7) and the following formula, which can easily be proved by means
of (3.5):


1 1 /p Psi 1 
os PNG? os 


where as usual the comma denotes covariant differentiation with respect to
$g_{ij}$. Equation (4.16) then becomes


[— *a(x™)om + (1/2)*a’(x™) + 864 


.18 
= p(Omp),sj — = p?Om 
p 


because $(\partial_m\rho)_{,ij} = \partial_m(\rho_{,ij})$. On account of (4.9), the scalar curvatures $a = {}^{*}a(x^m_0)$
and ${}^{*}a(x^m)$ of $g_{ij}$ and $\rho^{-2}g_{ij}$ are connected by [5, p. 90, (28.7)]


2 
(4.19) *a(x™) = ap? + — 


By use of this equation and its partial derivative with respect to $x^m$, we can
easily verify that the coefficient of $g_{ij}$ in (4.18) is


1 
ry g** [p(Omp) (Omp)p,rx]- 


This shows that, when (4.9) is supposed, equation (4.18), and hence also 
(4.16), are equivalent to 


(4.20) 


Finally, to reduce this to (4.11), we integrate it with respect to $x^m$ and
obtain
(4.21) — + Tis ~ 
p 


where $T_{ij}$ is an integration tensor independent of $x^m$. Now it follows from
the very definition of partial differentiation that for any function $\phi(x^i, x^m)$
of $x^i$ and $x^m$,



(2) ~ tus 


$[\partial_i\phi(x^i, x^m)]_{x^m = x^m_0} = \partial_i\phi(x^i, x^m_0).$


Therefore, in consequence of (4.9), we have $(\rho_{,i})_{x^m = x^m_0} = 0$, $(\rho_{,ij})_{x^m = x^m_0} = 0$,
whence, if we put $x^m = x^m_0$ in (4.21), the result is $T_{ij} \sim g_{ij}$. This shows that (4.21)
and hence also (4.20) are equivalent to (4.11). The proof of our theorem has
thus been completed.

Added in proof. In connection with Theorem 4.2' I may mention that in
a forthcoming paper of mine [17] a necessary and sufficient condition is obtained
for a $V_n$ to be imbeddable in an $E_{n+1}$ as a member of $\infty^1$ totally umbilical
hypersurfaces. There Theorem 4.2' appears as a corollary to a more
general result, and all the conformal-Euclidean $V_n$'s which satisfy this condition
of imbeddability are determined.


III. PROPERLY CONFORMALLY SEPARABLE TENSORS 
OF THE TYPE (n>1, m—n>1) 


5.1. An auxiliary theorem. In this section we shall consider the con- 
formally separable tensor 


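(5.1)  ${}^{*}g_{\alpha\beta} = \begin{pmatrix} \rho^{-2}g_{ij} & 0 \\ 0 & \sigma^{-2}g_{pq} \end{pmatrix}$,  $\alpha, \beta, \gamma, \delta = 1, \dots, m$;  $i, j, k, l = 1, \dots, n$;  $p, q, r, s = n+1, \dots, m$,

where

$g_{ij} = g_{ij}(x^k)$,  $g_{pq} = g_{pq}(x^r)$,  $\rho = \rho(x^\alpha)$,  $\sigma = \sigma(x^\alpha)$,  $\partial_p\rho \neq 0$,  $\partial_i\sigma \neq 0$.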
For (5.1), we have (cf. (3.6)) 
(5.2)  $-{}^{*}R_{ip} = (n-1)\partial_i\rho_p + (m-n-1)\partial_p\sigma_i + (m-2)\rho_p\sigma_i,$


1 
= ‘Rij + (m — n)o (-) 


1 
= "Ree + np (-) 


where the signs (;) (') indicate, respectively, the covariant differentiation and
the Ricci tensor referred to ${}^{*}g_{ij}$ or ${}^{*}g_{pq}$. We suppose as usual that


(5.3)  $\rho(x^i, x^p_0) = 1$,  $\sigma(x^i_0, x^p) = 1$.




To establish our main result, Theorem 5.2, in the latter part of this section, 
we need the following auxiliary 


THEOREM 5.1. For the properly conformally separable tensor (5.1) of the
type (n>1, m-n>1) with (5.3) satisfied:
(i) The system of equations


(5.4)  $-{}^{*}R_{ip} = (n-1)\partial_i\rho_p + (m-n-1)\partial_p\sigma_i + (m-2)\rho_p\sigma_i = 0$

is equivalent to

(5.4')  $\rho = \rho(y, z)$,  $\sigma = \sigma(y, z)$,
        $\rho(y, z_0) = 1$,  $\sigma(y_0, z) = 1$,
        $(n-1)\partial_y\rho_z + (m-n-1)\partial_z\sigma_y + (m-2)\rho_z\sigma_y = 0$("4),

where $y = y(x^i)$, $z = z(x^p)$ are any functions of the arguments indicated, and
$y_0 = y(x^i_0)$, $z_0 = z(x^p_0)$.
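(ii) If the tensors $g_{ij}$ and $g_{pq}$ are considered as given, the following system of
equations in the unknown functions $\rho(x^\alpha)$ and $\sigma(x^\alpha)$:

(5.5)$_1$  ${}^{*}R_{ip} = 0,$

(5.5)$_2$  ${}^{*}R_{ij} - {}'R_{ij} \sim {}^{*}g_{ij}$,  ${}^{*}R_{pq} - {}'R_{pq} \sim {}^{*}g_{pq}$,

is equivalent to the system of equations consisting of (5.4') and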


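(5.5')$_1$  $(\rho/\sigma)^2\,\partial_y\sigma = Q(z)$,  $(\sigma/\rho)^2\,\partial_z\rho = J(y)$,

(5.5')$_2$  $y_{,ij} \sim g_{ij}$,  $z_{,pq} \sim g_{pq}$,

in the unknown functions $y = y(x^i)$, $z = z(x^p)$, $\rho = \rho(y, z)$, $\sigma = \sigma(y, z)$, where J
and Q are any functions of the arguments indicated.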


5.2. Proof of Theorem 5.1 (i). Equation (5.4) can be written

$-(m-2)\rho_p\sigma_i = \partial_i\partial_p[(n-1)\log\rho + (m-n-1)\log\sigma].$
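
The rewriting of (5.4) rests only on the definitions of $\rho_p$ and $\sigma_i$. A minimal sketch, assuming $\rho_p = \partial_p\log\rho$ and $\sigma_i = \partial_i\log\sigma$ as in (3.3):

```latex
% Sketch only. Assumes \rho_p = \partial_p\log\rho, \sigma_i = \partial_i\log\sigma as in (3.3).
\begin{align*}
\partial_i\rho_p &= \partial_i\partial_p\log\rho, \qquad
\partial_p\sigma_i = \partial_p\partial_i\log\sigma = \partial_i\partial_p\log\sigma, \\
0 &= (n-1)\partial_i\rho_p + (m-n-1)\partial_p\sigma_i + (m-2)\rho_p\sigma_i \\
  &= \partial_i\partial_p\big[(n-1)\log\rho + (m-n-1)\log\sigma\big] + (m-2)\rho_p\sigma_i .
\end{align*}
```

Because the bracketed term is a mixed second derivative, a further derivative with respect to $x^q$ is symmetric in p and q, which is why the alternation in p and q removes it.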


Differentiating this partially with respect to $x^q$ and then taking the alternation
in the indices p and q, we get(*)


— (m — 2) (0:8 + = 0, 


(5.5’) 


that is, 
(S.6)e = O. 


When the value of $\partial_q\sigma_i$ from (5.4) is substituted (which is possible because
m-n>1), this becomes


Here we write $\partial_y = \partial/\partial y$, $\rho_z = \partial_z\log\rho$, $\sigma_y = \partial_y\log\sigma$.
(4) We write, for example, 3; = 9 gop — App 


J(y), 




— — + (m — 2)pqi0:] = 0, 
which, since n>1, reduces to
= O. 
This shows that a function $\theta(x^i, x^p)$ exists such that
$\partial_i\rho_p = (\partial_i\log\theta)\,\rho_p,$
and hence
$\rho_p = \theta(x^i, x^p)\,w_p(x^r),$
where $w_p(x^r)$ are m-n functions of $x^r$ alone. Now from the very definition of
partial derivative, we have that for any fixed values $x^i_0$ of $x^i$,
Therefore, if we write $z = \rho(x^i_0, x^p)$ and remember that $\rho_p = \partial_p\log\rho$, then it
follows from the two preceding equations that


2°) Pps 


a, log = = 


that is, 


x , x 
——— 9, log z, 
2°) 


which shows that $\rho$ can be expressed in terms of $x^i$ and z alone; thus,
(5.7)$_a$  $\rho = \rho(x^i, z).$


Since (5.4) as well as the hypothesis following (5.1) remain the same when
$\rho$, $\sigma$; n, m-n; i, p are interchanged, we have, by symmetry,


(5.7)$_b$  $\sigma = \sigma(y, x^p),$


where y is defined by $y = \sigma(x^i, x^p_0)$.
From (5.7) we have 


where p, =9, log p, oy=0, log o. Using these in 
(S.6)p = 0, 
which is the symmetric expression of (5.6),, we find 
9 = O. 


Since by hypothesis 0,90, 0,00, so that 0,0,20, the above equations 
are equivalent to 




(99) 9 = 0. 
From this it follows that $\rho_z$ can be expressed in terms of $x^p$ and y alone. But
on the other hand, (5.7)$_a$ shows that $\rho_z$ can be expressed in terms of $x^i$ and z
alone. Therefore $\rho_z$ is a function of y and z alone, and hence $\rho$ must be of the
form

$\rho = F(x^i)\,\phi(y, z).$


Taking (5.3) into account, we have

$\rho = \dfrac{\rho(x^i, x^p)}{\rho(x^i, x^p_0)} = \dfrac{F(x^i)\,\phi(y, z)}{F(x^i)\,\phi(y, z_0)},$

where $z_0 = z(x^p_0)$. Thus

(5.8)$_a$  $\rho = \rho(y, z)$,  $\rho(y, z_0) = 1$,
and by symmetry,

(5.8)$_b$  $\sigma = \sigma(y, z)$,  $\sigma(y_0, z) = 1$.


Now for $\rho$ and $\sigma$ of the form (5.8), equations (5.4) become, after omitting
the non-vanishing factor $(\partial_i y)(\partial_p z)$,


(5.9)  $(n-1)\partial_y\rho_z + (m-n-1)\partial_z\sigma_y + (m-2)\rho_z\sigma_y = 0.$


Equations (5.8) and (5.9) are identical with (5.4'), and therefore Theorem
5.1 (i) is proved.

Remark. For any $V_m$ with fundamental tensor (5.1), ${}^{*}R_{ip} = 0$ is the condition
that there be n independent congruences of Ricci curves of $V_m$ lying
in the subspaces $x^p$ = const. Hence from (5.8) we have incidentally:

If, in a $V_m$ with a properly conformally separable fundamental tensor of the
type (n>1, m-n>1), the subspaces $x^p$ = const. contain n independent congruences
of Ricci curves of $V_m$, then the $\infty^{m-n}$ subspaces $x^p$ = const. and the $\infty^{n}$
subspaces $x^i$ = const. consist of $\infty^1$ families of $\infty^{m-n-1}$ isometric $V_n$'s and $\infty^1$
families of $\infty^{n-1}$ isometric $V_{m-n}$'s, respectively.

5.3. Proof of Theorem 5.1 (ii). By Theorem 5.1 (i), equation (5.5)$_1$,
which is identical with (5.4), is equivalent to (5.4'). If

$y = y({}'y)$,  $z = z({}'z)$

is any nonsingular transformation from y, z to ${}'y$, ${}'z$, then (5.4') become

(5.4'')  $\rho = \rho[y({}'y), z({}'z)]$,  $\sigma = \sigma[y({}'y), z({}'z)]$,
         $\rho[y({}'y), z({}'z_0)] = 1$,  $\sigma[y({}'y_0), z({}'z)] = 1$,
         $(n-1)\partial_{'y}\rho_{'z} + (m-n-1)\partial_{'z}\sigma_{'y} + (m-2)\rho_{'z}\sigma_{'y} = 0$,

where ${}'y_0$, ${}'z_0$ are any roots of the equations $y({}'y) = y_0$, $z({}'z) = z_0$. Hence




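Theorem 5.1 (ii) is true if a suitable nonsingular transformation $y = y({}'y)$,
$z = z({}'z)$ exists such that equations (5.5)$_2$ reduce to

(5.5'')  $(\rho/\sigma)^2\,\partial_{'y}\sigma = {}'Q({}'z)$,  $(\sigma/\rho)^2\,\partial_{'z}\rho = {}'J({}'y)$,
         ${}'y_{,ij} \sim g_{ij}$,  ${}'z_{,pq} \sim g_{pq}$.

We now proceed to prove that this is the case.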
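By (5.2)$_2$, equation (5.5)$_{2a}$ is equivalent to

(5.5''')$_{2a}$  $(1/\sigma)_{;ij} \sim {}^{*}g_{ij}.$

When the covariant derivative $(1/\sigma)_{;ij}$ taken with respect to ${}^{*}g_{ij} = \rho^{-2}g_{ij}$ is
expressed in terms of the covariant derivative $(1/\sigma)_{,ij}$ taken with respect to
$g_{ij}$, equation (5.5''')$_{2a}$ becomes (cf. (2.13))

(5.5$^{iv}$)$_{2a}$  $(1/\sigma)_{,ij} + \rho_i\,(1/\sigma)_{,j} + \rho_j\,(1/\sigma)_{,i} \sim g_{ij}.$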


By (5.4')$_1$, $\rho$ and $\sigma$ are functions of y and z, and therefore


When these are used in (5.5$^{iv}$)$_{2a}$, the latter becomes


(dy)? 
[2 -—— -2 ~ Bin 
p 


o 


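which, because $\partial_y\sigma \neq 0$ by hypothesis, can be written

(5.10)  $y_{,ij} + \left(\partial_y\log\dfrac{\rho^2\,\partial_y\sigma}{\sigma^2}\right) y_{,i}\,y_{,j} \sim g_{ij}.$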
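
The passage from (5.5$^{iv}$)$_{2a}$ to (5.10) is a chain-rule computation. The sketch below assumes (5.5$^{iv}$)$_{2a}$ in the form $(1/\sigma)_{,ij} + \rho_i(1/\sigma)_{,j} + \rho_j(1/\sigma)_{,i} \sim g_{ij}$ with $\rho_i = \partial_i\log\rho$, which is a reading of the garbled display and not a quotation.

```latex
% Sketch only. Assumes (5.5^{iv})_{2a} in the form
%   (1/\sigma)_{,ij} + \rho_i (1/\sigma)_{,j} + \rho_j (1/\sigma)_{,i} \sim g_{ij},
% with \rho_i = \partial_i \log\rho, and \rho = \rho(y,z), \sigma = \sigma(y,z),
% y = y(x^i), z = z(x^p); z is constant under differentiation along x^i.
\begin{align*}
\rho_i &= (\partial_y\log\rho)\,y_{,i}, \qquad
(1/\sigma)_{,i} = \partial_y(1/\sigma)\,y_{,i}, \\
(1/\sigma)_{,ij} &= \partial_y(1/\sigma)\,y_{,ij} + \partial_y\partial_y(1/\sigma)\,y_{,i}\,y_{,j}, \\
\partial_y(1/\sigma)\,y_{,ij}
  &+ \big[\partial_y\partial_y(1/\sigma) + 2(\partial_y\log\rho)\,\partial_y(1/\sigma)\big]\,y_{,i}\,y_{,j} \sim g_{ij}, \\
y_{,ij} &+ \partial_y\log\big|\rho^2\,\partial_y(1/\sigma)\big|\;y_{,i}\,y_{,j} \sim g_{ij},
\qquad \partial_y(1/\sigma) = -\frac{\partial_y\sigma}{\sigma^2}.
\end{align*}
```

The division in the last step is allowed because $\partial_y\sigma \neq 0$, and the constant sign inside the logarithm does not affect its derivative.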


A consequence of this equation is

(5.11)  $\partial_y\log\dfrac{\rho^2\,\partial_y\sigma}{\sigma^2}$ = function of y alone.
For, if we write (5.10) as

$y_{,ij} + A(y, z)\,y_{,i}\,y_{,j} \sim g_{ij},$

and eliminate $y_{,ij}$ from it and

$y_{,ij} + A(y, z_1)\,y_{,i}\,y_{,j} \sim g_{ij},$

where $z_1$ is a constant, we find

$[A(y, z) - A(y, z_1)]\,y_{,i}\,y_{,j} \sim g_{ij}.$

Since $g_{ij}$ is of rank greater than 1, the coefficient of $y_{,i}\,y_{,j}$ in the above equation
must be zero; thus $A(y, z) = A(y, z_1)$, which proves (5.11).
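This being the case, we have by integration of (5.11) that

(5.12)$_a$  $\dfrac{\rho^2\,\partial_y\sigma}{\sigma^2} = I(y)\,Q(z).$

In like manner we derive from (5.5)$_{2b}$ that

(5.12)$_b$  $\dfrac{\sigma^2\,\partial_z\rho}{\rho^2} = J(y)\,P(z).$

Here I, J, P, Q are some functions of the arguments indicated.
Now consider the functions ${}'y$, ${}'z$ introduced (to within integration constants) by

(5.13)  ${}'y = \int I\,dy$,  ${}'z = \int P\,dz$.

Since $\partial_z\rho \neq 0$, $\partial_y\sigma \neq 0$ by hypothesis, it follows from (5.12) that neither I(y)
nor P(z) can be identically zero. Consequently, (5.13) define a nonsingular
transformation, which evidently carries (5.12) into

(5.14)$_1$  $\dfrac{\rho^2\,\partial_{'y}\sigma}{\sigma^2} = Q$,  $\dfrac{\sigma^2\,\partial_{'z}\rho}{\rho^2} = J$.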

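Finally, if we recall the way in which (5.10) was derived from (5.5)$_{2a}$, it will
at once become obvious that the expression for (5.5)$_{2a}$ in terms of ${}'y$ and ${}'z$ is
obtained by replacing y by ${}'y$ in (5.10); that is, (5.13) transforms (5.5)$_{2a}$ into

${}'y_{,ij} + \left(\partial_{'y}\log\dfrac{\rho^2\,\partial_{'y}\sigma}{\sigma^2}\right){}'y_{,i}\,{}'y_{,j} \sim g_{ij}.$

In consequence of (5.14)$_{1a}$, this becomes

(5.14)$_{2a}$  ${}'y_{,ij} \sim g_{ij}.$

Similarly, in terms of ${}'y$ and ${}'z$, equation (5.5)$_{2b}$ becomes

(5.14)$_{2b}$  ${}'z_{,pq} \sim g_{pq}.$


Equations (5.14) are identical with (5.5''), which proves our theorem.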
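
The effect of the substitution (5.13) can be checked directly. The following sketch assumes (5.12)$_a$ in the form $\rho^2\partial_y\sigma/\sigma^2 = I(y)Q(z)$ and reads (5.13) as $d\,{}'y = I(y)\,dy$.

```latex
% Sketch only. Assumes (5.12)_a: \rho^2\partial_y\sigma/\sigma^2 = I(y)Q(z),
% and (5.13): d\,{}'y = I(y)\,dy with I(y) not identically zero.
\begin{align*}
\partial_{'y} &= \frac{dy}{d\,{}'y}\,\partial_y = \frac{1}{I(y)}\,\partial_y, \\
\frac{\rho^2\,\partial_{'y}\sigma}{\sigma^2}
  &= \frac{1}{I(y)}\cdot\frac{\rho^2\,\partial_y\sigma}{\sigma^2} = Q(z), \\
\partial_{'y}\log\frac{\rho^2\,\partial_{'y}\sigma}{\sigma^2}
  &= \partial_{'y}\log Q(z) = 0 .
\end{align*}
```

The last line is why the quadratic term drops out of the transformed (5.10), leaving only the proportionality ${}'y_{,ij} \sim g_{ij}$.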




Remark. From the above proof, it is easily seen that equations (5.5)$_1$,
(5.5)$_{2a}$, by themselves, are equivalent to (5.4'), (5.5')$_{1a}$, (5.5')$_{2a}$; and equations
(5.5)$_1$, (5.5)$_{2b}$, by themselves, to (5.4'), (5.5')$_{1b}$, (5.5')$_{2b}$.

5.4. An important property. We are now ready to prove the following


THEOREM 5.2. If a properly conformally separable tensor ${}^{*}g_{\alpha\beta}$ of the type
(n, m-n) represents an $E_m$ and each of its component tensors either is of dimension
2 or represents E's, then ${}^{*}g_{\alpha\beta}$ is conformal to a separable tensor of the type
(n, m-n).

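Proof. By supposition, we have

(5.15)  ${}^{*}R_{\alpha\beta} = -(m-1)c\,{}^{*}g_{\alpha\beta}$,  c = const.,

(5.16)$_a$  ${}'R_{ij} = -(n-1)\,{}^{*}a(x^i, x^p)\,{}^{*}g_{ij}$,

(5.16)$_b$  ${}'R_{pq} = -(m-n-1)\,{}^{*}b(x^i, x^p)\,{}^{*}g_{pq}$.


As a consequence of these equations, equations (5.5) are true. Thus by Theorem
5.1 (ii), two functions $y = y(x^i)$, $z = z(x^p)$ exist such that the following
equations are satisfied:

(5.17)  $\rho = \rho(y, z)$,  $\sigma = \sigma(y, z)$,
        $\rho(y, z_0) = 1$,  $\sigma(y_0, z) = 1$,
        $(n-1)\partial_y\rho_z + (m-n-1)\partial_z\sigma_y + (m-2)\rho_z\sigma_y = 0$,

(5.18)  $(\rho/\sigma)^2\,\partial_y\sigma = Q(z)$,  $(\sigma/\rho)^2\,\partial_z\rho = J(y)$,
        $y_{,ij} \sim g_{ij}$,  $z_{,pq} \sim g_{pq}$.


Our theorem will be proved if we can show that as a result of (5.16),
(5.17) and (5.18), the function $\rho/\sigma$ is of the form ${}'\rho(y)/{}'\sigma(z)$. We treat the
two cases n>2, m-n>1 and n = m-n = 2 separately.

Case 1. n>2, m-n>1.

Since n>2, we have by supposition that $\rho^{-2}g_{ij}$ represents $E_n$'s. Therefore
it follows from (5.16)$_a$ and (5.17)$_{1a}$ that ${}^{*}a(x^i, x^p) = {}^{*}a(z)$. On account of
(5.17)$_{2a}$, $g_{ij}$ is the fundamental tensor of an $E_n$ whose scalar curvature is
$a = {}^{*}a(z_0)$. Thus, by Theorem 2.1, equation (5.16)$_a$ implies that

(5.19)  $\rho_{,ij} \sim g_{ij}.$

In virtue of (5.17)$_{1a}$, this can be written

$(\partial_y\rho)\,y_{,ij} + (\partial_y\partial_y\rho)\,y_{,i}\,y_{,j} \sim g_{ij},$

which becomes, because of (5.18)$_{2a}$,

$(\partial_y\partial_y\rho)\,y_{,i}\,y_{,j} \sim g_{ij}.$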




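Since $g_{ij}$ is of rank greater than 1, it follows from this that $\partial_y\partial_y\rho = 0$, whence

(5.20)  $\rho = U(z)\,y + V(z).$

Using this in (5.18)$_{1a}$, we have

$\partial_y\!\left(\dfrac{1}{\sigma}\right) = -Q\,(Uy+V)^{-2} = \begin{cases} -Q/V^2 & \text{if } U = 0, \\ (Q/U)\,\partial_y (Uy+V)^{-1} & \text{if } U \neq 0. \end{cases}$


Integrating these with respect to y and then making use of (5.20), we find
that for both cases $\rho/\sigma$ is of the form


(5.21)  $\dfrac{\rho}{\sigma} = W(z)\,y + Z(z).$


The U, V in (5.20) and W, Z in (5.21) are all functions of z alone.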
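
The integration leading to (5.21) can be made explicit. In the sketch below, C(z) is a hypothetical name for the function of integration; (5.18)$_{1a}$ is used in the equivalent form $\partial_y(1/\sigma) = -Q(z)/\rho^2$.

```latex
% Sketch only. Assumes (5.18)_{1a} in the form \partial_y(1/\sigma) = -Q(z)/\rho^2
% and (5.20): \rho = U(z)y + V(z). C(z) is a hypothetical name for the
% function of integration.
\begin{align*}
U = 0:&\quad \frac{1}{\sigma} = -\frac{Q}{V^2}\,y + C(z),
  &\frac{\rho}{\sigma} &= -\frac{Q}{V}\,y + V\,C(z), \\
U \neq 0:&\quad \frac{1}{\sigma} = \frac{Q}{U\,(Uy+V)} + C(z),
  &\frac{\rho}{\sigma} &= U\,C(z)\,y + \Big(\frac{Q}{U} + V\,C(z)\Big).
\end{align*}
```

In both cases $\rho/\sigma$ is linear in y with coefficients depending on z alone, which is the form (5.21).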
If either W or Z is identically zero, $\rho/\sigma$ will be of the form ${}'\rho(y)/{}'\sigma(z)$, and
our theorem is proved.

Now suppose that neither W nor Z is identically zero. When (5.20), (5.21)
are used in (5.17)$_3$, the latter reduces to

(5.22)  $(m-2)(U'Z - V'W)(Wy + Z) + (m-n-1)(WZ' - W'Z)(Uy + V) = 0,$

where the prime denotes differentiation. From this it can be proved that

(5.23)  $WZ' - W'Z = 0.$

Assume that this is not true. Then since $W \neq 0$, $Z \neq 0$, we have from (5.22)
that

$U/W = V/Z = X(z),$

where $X \neq 0$, otherwise $\rho = 0$ by (5.20). Now it is easily verified that in consequence
of the above equations, (5.22) becomes $(n-1)(WZ' - W'Z)X = 0$,
which cannot be satisfied. Thus (5.23) is true, and consequently W and Z
differ by a constant factor. Hence it follows from (5.21) that $\rho/\sigma$ is of the form
${}'\rho(y)/{}'\sigma(z)$, and the proof of Theorem 5.2 for the case n>2, m-n>1 is
completed.
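
The reduction of (5.22) under $U = XW$, $V = XZ$ is stated as easily verified; one way to carry it out is sketched here, with X = X(z) as in the text.

```latex
% Sketch only. Substitutes U = XW, V = XZ (X = X(z)) into (5.22).
\begin{align*}
U'Z - V'W &= (X'W + XW')Z - (X'Z + XZ')W = -X\,(WZ' - W'Z), \\
Uy + V &= X\,(Wy + Z), \\
(5.22):\quad 0 &= \big[-(m-2) + (m-n-1)\big]\,X\,(WZ' - W'Z)(Wy+Z) \\
  &= -(n-1)\,X\,(WZ' - W'Z)(Wy+Z).
\end{align*}
```

Since $Wy + Z = \rho/\sigma$ does not vanish, this gives $(n-1)(WZ' - W'Z)X = 0$, which contradicts the assumptions n>1, $X \neq 0$ and $WZ' - W'Z \neq 0$.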

Remark. We observe that in the above proof we made use only of equations
(5.16)$_a$, (5.17), (5.18)$_{1a}$, (5.18)$_{2a}$, but not of (5.15), (5.16)$_b$, (5.18)$_{1b}$,
(5.18)$_{2b}$. Now by the remark at the end of §5.3, equations (5.17), (5.18)$_{1a}$,
(5.18)$_{2a}$ are equivalent to (5.5)$_1$, (5.5)$_{2a}$, namely,

${}^{*}R_{ip} = 0$,  ${}^{*}R_{ij} - {}'R_{ij} \sim {}^{*}g_{ij}$.

Thus our conclusion that $\rho/\sigma$ is of the form ${}'\rho(y)/{}'\sigma(z)$ is in fact a consequence
of the following equations:

${}^{*}R_{ip} = 0$,  ${}^{*}R_{ij} \sim {}^{*}g_{ij}$,  ${}'R_{ij} \sim {}^{*}g_{ij}$.

Hence:

If, at each point $P_0$, coordinates $x^\alpha_0$, of a $V_m$ with properly conformally separable
fundamental tensor ${}^{*}g_{\alpha\beta}$ of the type (n>2, m-n>1), every direction in the
$V_n$: $x^p = x^p_0$ is a Ricci direction both of $V_n$ and of $V_m$, then ${}^{*}g_{\alpha\beta}$ is conformal to a
separable tensor of the type (n, m-n).

Case 2.n=m—n=2. 

In this case, equation (5.19) is in general not true, and we shall base our
proof of Theorem 5.2 on (5.17) and (5.18) alone. Equation (5.17)$_3$ is now
equivalent to


(5.24) OyPs + psty = + = — w, 
where w = w(y, z). If w = 0, then since $\rho_z \neq 0$, (5.24)$_a$ can be written
$\partial_y\log(\sigma\rho_z) = 0,$


which gives us, on integration, 


Ops that is, 


~ 
Comparison of the last equation with (5.18)$_{1b}$ shows that $\sigma/\rho = {}'\sigma(z)\,J(y)$,
which proves our theorem.

We now suppose that $w \neq 0$ and always bear in mind that J = J(y), Q = Q(z).
Then on account of (5.18)$_1$, equation (5.24)$_a$ can be written


Jp p Jpf/J'’ Qe 
that is, 


1 J’ Qe 
(5.25) (-) == 
pJ 
In like manner, (5.24), can be reduced to 


(5.25) 


We now find the integrability condition $\partial_z\partial_y(1/\rho) = \partial_y\partial_z(1/\rho)$ for (5.18)
and (5.25)$_a$. Differentiate (5.25)$_a$ with respect to z and we have


+4 4 2wo? (-)] 
p p p 


(5.26) 


17 
1 1 
wo? i 
Jp? 
Qa,e Qo JO 
—— = — —+—- 
ope 
J Lp? 
| 




On the other hand we have from (5.18); 


(5.27) a,(—) = - 


Using (5.27); and (5.25), in (5.26) and comparing the result with (5.27), we 
find 


p p 


(ah 


pe J o? 


that is, 
20’ wp? 
Q 


This is the integrability condition we wished to establish. But because of 
(5.18):, (5.24), we have 


(5.28) 


wp? w 
— = — = — 4, log (pe,). 


oQ Gy 
Therefore (5.28) becomes 


which, by (5.18)1., can be written 
(5.29), 0, log 0. 
wo 


Following a procedure symmetric to the above one, we can prove that 


(5 Oy log —=0. 
wp? 


From (5.29) it follows at once that 


(5.30) 8,0, log — = 0, 


1 J’ ayo J’ 20 
= — + — = - 
J Le? On? pe 
which simplifies into 
.. 
Jp 




which show that $\rho/\sigma$ is of the form ${}'\rho(y)/{}'\sigma(z)$. Thus the proof of Theorem 5.2
is completed.

Remark. The above proof for the case n = m-n = 2 holds also for the
more general case n = m-n > 1, but not for other cases. Indeed, when n>1,
m-n>1, equation (5.17)$_3$ may be replaced by (cf. (5.24))


(3) 


and if we carry out on these equations a procedure similar to that which we 
did on (5.24) for the case w x0, then the final result corresponding to (5.30) is 


n—1 m—-n—1 
+4) tog» - log = 0. 


1 
This reduces to (5.30) when and only when n = m-n, which proves our assertion.
Moreover, since our proof depends only on (5.17), (5.18), which are
equivalent to (5.5), we have

If the fundamental tensor ${}^{*}g_{\alpha\beta}$ of a $V_{2n}$ (n>1) is a properly conformally
separable tensor of the type (n, n) and the Ricci tensors ${}^{*}R_{\alpha\beta}$, ${}'R_{ij}$, ${}'R_{pq}$ of $V_{2n}$
and its subspaces $x^p$ = const. and $x^i$ = const. satisfy the relations:


${}^{*}R_{ip} = 0$,  ${}^{*}R_{ij} - {}'R_{ij} \sim {}^{*}g_{ij}$,  ${}^{*}R_{pq} - {}'R_{pq} \sim {}^{*}g_{pq}$,


then ${}^{*}g_{\alpha\beta}$ is conformal to a separable tensor of the type (n, n).

6.1. Main results. Theorem 5.2 enables us to bring to a satisfactory conclusion
our study of a properly conformally separable tensor which represents
an $E_m$ and each of whose component tensors either is of dimension 2 or
represents E's.

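Consider the properly conformally separable tensor

(6.1)  ${}^{*}g_{\alpha\beta} = \tau^{-2}\,\bar g_{\alpha\beta}$,  $\bar g_{\alpha\beta} = \begin{pmatrix} \bar g_{ij} & 0 \\ 0 & \bar g_{pq} \end{pmatrix}$,  $i, j, k, l = 1, \dots, n$,

where

(6.2)  n > 1,  m-n > 1,  $\tau = \tau(x^\alpha)$,  $\partial_i\tau \neq 0$,  $\partial_p\tau \neq 0$,  $\bar g_{ij} = \bar g_{ij}(x^k)$,  $\bar g_{pq} = \bar g_{pq}(x^r)$.


Let quantities referred to $\bar g_{\alpha\beta}$, $\bar g_{ij}$, or $\bar g_{pq}$ and covariant differentiations taken
with respect to them be marked by the signs (''), (-); (.), (/). Then we have
(cf. (3.2), (3.3))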


if $\alpha$, $\beta$, $\gamma$ are not all in the same range; and consequently,




(6.3) = = Their T: pq = 


(6.4)  ${}''R_{ip} = 0$,  ${}''R_{ij} = \bar R_{ij}$,  ${}''R_{pq} = \bar R_{pq}$.
Also from (6.1) we have [5, p. 90 (28.6) ] 


(6.8) *Rap = "Rap — (m= 2) + — — 1) 
T 


72 
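Now suppose that (6.1) represents an $E_m(c)$, that is, that

(6.6)  ${}^{*}R_{\alpha\beta} = -(m-1)c\,{}^{*}g_{\alpha\beta}$,  c = const.


In consequence of this and equations (6.1)-(6.4), equation (6.5) for $(\alpha, \beta)$
= (i, p) becomes $\partial_p\partial_i\tau = 0$, and therefore("*)


(6.7)  $\tau = y(x^i) + z(x^p),$

where $y \neq$ const., $z \neq$ const. because of (6.2). Hence (6.1) may be written

(6.8)  ${}^{*}g_{\alpha\beta} = (y+z)^{-2}\,\bar g_{\alpha\beta}.$

If n>2 (n=2) and the component tensor ${}^{*}g_{ij} = (y+z)^{-2}\bar g_{ij}$ represents $E_n$'s
($S_2$'s), then $g_{ij} \equiv y^{-2}\bar g_{ij}$ represents an $E_n$ ($S_2$), because ${}^{*}g_{ij}$ becomes $g_{ij}$ for z = 0.
Consequently, we have by Theorem 2.1

(6.9)  $\left(\dfrac{y+z}{y}\right)_{,ij} \sim g_{ij}$, that is, $\left(\dfrac{1}{y}\right)_{,ij} \sim g_{ij}$,

where, as usual, a comma denotes covariant differentiation with respect to
$g_{ij}$. Thus by Theorems 2.1 and 2.2, the tensor $y^2 g_{ij} = \bar g_{ij}$ is the fundamental
tensor of an $E_n$ ($S_2$).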
We now suppose, besides (6.7), also that each of the component tensors
of the tensor (6.8) either is of dimension 2 or represents E's. Then it follows
from the above observation that

(6.10)$_a$  $\bar R_{ij} = -(n-1)\,\bar a(x^i)\,\bar g_{ij},$

(6.10)$_b$  $\bar R_{pq} = -(m-n-1)\,\bar b(x^p)\,\bar g_{pq},$

where $\bar a(x^i)$ is constant when n>2, and $\bar b(x^p)$ is constant when m-n>2. On
account of (6.3), (6.4), (6.6), (6.7), (6.9) and (6.10), equation (6.5) for $(\alpha, \beta)$
= (i, j) and (p, q) become

(6.11). (m — = — (m — 1)d(x*) — (m — 


(**) We note that these y, z are not identical with the y, z which appeared in §§5.1-5.4.


— (m — = — (m — — 1)B( x") Bog — (m — 2) 

From these it follows at once that 
(6.12)  $y_{.ij} = -L\,\bar g_{ij}$,  $z_{/pq} = -S\,\bar g_{pq}$,


which, because of (6.10), imply (cf. Theorems 2.3 and (2.9), (2.10)), respec- 
tively. 


(6.13). = — 2M(y), L=L(y)= M'(y), a(x*) = a(y) = M(y), 
(6.13), 279% = — 2T(2), S=S(z)= T(z), 5(x*) = = T’(2), 
where the prime denotes differentiation. In consequence of these, equations
(6.11) are equivalent to (6.12), (6.13) and

(6.14)  $-(m-1)c\,\tau^{-2} + (n-1)M'' - (m-2)M'\tau^{-1}$
        $= -(m-1)c\,\tau^{-2} + (m-n-1)T'' - (m-2)T'\tau^{-1}$
        $= [nM' + (m-n)T']\,\tau^{-1} - 2(m-1)(M+T)\,\tau^{-2}.$

These last equations together with (6.7) can be solved for M(y) and T(z).
Indeed, the first equation of (6.14) can be written

(6.15)  $(m-2)(M' - T') - (y+z)[(n-1)M'' - (m-n-1)T''] = 0.$

Differentiating this partially with respect to y twice, we find

$(m-2n)M''' - (n-1)(y+z)M^{iv} = 0,$

which, because M is a function of y alone, gives us $M^{iv} = 0$. Hence


(6.16)$_a$  $M(y) = a_0 + a_1 y + a_2 y^2 + a_3 y^3,$

and by symmetry,

(6.16)$_b$  $T(z) = b_0 + b_1 z + b_2 z^2 + b_3 z^3,$

where the a's and b's are constants. When these values of M and T are substituted
in (6.15), the latter reduces to

(6.17)  $a_1 = b_1$,  $a_2 + b_2 = 0$,  $b_3 = a_3 = 0$  if $m \neq 2n$,
        $a_1 = b_1$,  $a_2 + b_2 = 0$,  $b_3 = a_3$  if $m = 2n$.
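
The conditions on the coefficients can be read off by inserting the cubics into (6.15) and collecting powers of y and z. This is a sketch of that bookkeeping only; it uses m>2 and n>1, m-n>1.

```latex
% Sketch only. Inserts the cubics (6.16) into (6.15) and collects powers of y and z.
\begin{align*}
1:&\quad (m-2)(a_1 - b_1) = 0, \\
y:&\quad 2(m-n-1)(a_2 + b_2) = 0, \\
z:&\quad -2(n-1)(a_2 + b_2) = 0, \\
y^2:&\quad 3(m-2n)\,a_3 = 0, \\
z^2:&\quad 3(m-2n)\,b_3 = 0, \\
yz:&\quad -6(n-1)\,a_3 + 6(m-n-1)\,b_3 = 0 .
\end{align*}
```

For $m \neq 2n$ the quadratic terms force $a_3 = b_3 = 0$; for m = 2n they vanish identically and the yz term gives $a_3 = b_3$, since then n-1 = m-n-1.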


Now in consequence of (6.16) and (6.17), equations (6.14) become
(6.18)  $c = 2(a_0 + b_0),$
(6.19)  $a_3 = 0$ if m = 2n = 4 are not satisfied.


Thus, the solution of equations (6.14) is given by (6.16)—(6.19). 
By Theorem 5.2, the preceding results prove the necessity of the condition 
in the following 




THEOREM 6.1. In order that a properly conformally separable tensor ${}^{*}g_{\alpha\beta}$ of
the type (n>1, m-n>1) may represent an $E_m$ and each of its component
tensors either be of dimension 2 or represent E's, it is necessary and sufficient
that ${}^{*}g_{\alpha\beta}$ be of the form (6.8) and equations (6.10), (6.12), (6.13), (6.16)-(6.19)
be satisfied.


The sufficiency of the condition in this theorem can be proved as follows.
If n>2, (6.10)$_a$ shows that $\bar g_{ij}$ represents an $E_n$, and consequently by (6.12)$_a$,
the tensor $g_{ij} = y^{-2}\bar g_{ij}$ also represents an $E_n$ (Theorem 2.1). Thus equations
(6.9) are satisfied, and accordingly the component tensor ${}^{*}g_{ij} = [(y+z)/y]^{-2}g_{ij}$
represents $E_n$'s. Similarly, it follows from (6.10)$_b$ and (6.12)$_b$ that if m-n>2,
the component tensor ${}^{*}g_{pq}$ represents $E_{m-n}$'s. Finally, from the way in which
equations (6.10), (6.12), (6.13), (6.16)-(6.19) were derived, it is readily seen
that if these equations are satisfied, equation (6.6) must also be satisfied.
Hence ${}^{*}g_{\alpha\beta}$ represents an $E_m$ and our theorem is completely proved.

Now we are ready to establish the following three main results. 


THEOREM 6.2. If a properly conformally separable tensor represents an $E_m$
and one of its component tensors represents E's or $S_2$'s, then the other component
tensor, if it is of dimension 2, represents $S_2$'s.


THEOREM 6.3. In order that a properly conformally separable tensor ${}^{*}g_{\alpha\beta}$
may represent an $E_m$ and each of its component tensors E's or $S_2$'s it is necessary
and sufficient that the following conditions be fulfilled:

(1) ${}^{*}g_{\alpha\beta}$ is of the form

${}^{*}g_{\alpha\beta} = (y+z)^{-2}\begin{pmatrix} \bar g_{ij} & 0 \\ 0 & \bar g_{pq} \end{pmatrix}$,  $y = y(x^i)$,  $z = z(x^p)$,  $\bar g_{ij} = \bar g_{ij}(x^k)$,  $\bar g_{pq} = \bar g_{pq}(x^r)$.

(2) The tensors $\bar g_{ij}$, $\bar g_{pq}$ each represent an E or $S_2$ with scalar curvatures
$\bar a$, $\bar b$ connected by $\bar a + \bar b = 0$.
(3) The equations

$y_{.ij} = -(\bar a y + f)\,\bar g_{ij}$,  $z_{/pq} = -(\bar b z + f)\,\bar g_{pq}$

are satisfied with a constant f.

If these conditions are fulfilled, the scalar curvature c of ${}^{*}g_{\alpha\beta}$ is equal to the
sum of the scalar curvatures of $y^{-2}\bar g_{ij}$ and $z^{-2}\bar g_{pq}$ (each of which, as is implied by
(2) and (3), is the fundamental tensor of an E or $S_2$).


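THEOREM 6.4. In order that a properly conformally separable tensor ${}^{*}g_{\alpha\beta}$
of the type (2, 2), whose component tensors do not represent $S_2$'s, may be the
fundamental tensor of an $E_4$, it is necessary and sufficient that (1) ${}^{*}g_{\alpha\beta}$ be of the
form

${}^{*}g_{\alpha\beta} = [A(\bar a + \bar b)]^{-2}\begin{pmatrix} \bar g_{ij} & 0 \\ 0 & \bar g_{pq} \end{pmatrix}$,  A = const.,

and (2) the equations

$\bar a_{.ij} = -(1/2)(\bar a^2 + B)\,\bar g_{ij}$,  $\bar b_{/pq} = -(1/2)(\bar b^2 + B)\,\bar g_{pq}$

be satisfied with a constant B, where $\bar a$, $\bar b$ are the scalar curvatures of the fundamental
tensors $\bar g_{ij}$, $\bar g_{pq}$.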


The proof of these theorems will be based on Theorem 6.1. For Theorem
6.2, we suppose for the moment that m-n = 2 and that if n = 2, ${}^{*}g_{ij}$ represents
$S_2$'s. Theorem 6.2 will be proved if we can show that in this case
$a_3 = 0$. Indeed, if $a_3 = 0$, then by (6.17), $b_3 = 0$, and consequently, by (6.13)$_b$
and (6.16)$_b$, $\bar b(x^p) = T''(z) = 2b_2$ = const. Therefore $\bar g_{pq}$ represents an $S_2$. Since
(6.12)$_b$ can be written as $(y+z)_{/pq} = -S\,\bar g_{pq}$, the component tensor ${}^{*}g_{pq}
= (y+z)^{-2}\bar g_{pq}$ represents $S_2$'s. It remains therefore to prove that $a_3 = 0$.

If n>2, $a_3 = 0$ is given by (6.19) without further proof. If n = 2, we have
by hypothesis that ${}^{*}g_{ij} = (y+z)^{-2}\bar g_{ij}$ represents $S_2$'s, which implies that the
tensor $g_{ij} = y^{-2}\bar g_{ij}$ also represents an $S_2$. Therefore, in consequence of (6.12)$_a$,
$\bar g_{ij} = y^2 g_{ij}$ is the fundamental tensor of an $S_2$ (Theorem 2.5). Hence we have
from (6.13)$_a$ and (6.16)$_a$ that $\bar a(x^i) = M''(y) = 2a_2 + 6a_3 y$ = const. From this it
follows that $a_3 = 0$, as was to be proved. Theorem 6.2 has thus been completely
established.

As a consequence of Theorem 6.2, for a properly conformally separable
tensor which represents an $E_m$ and each of whose component tensors either
is of dimension 2 or represents E's, only two cases can happen: either (1) each
of its component tensors represents E's or $S_2$'s, or (2) m = 2n = 4 and neither
of them represents $S_2$'s. They are the two cases which we deal with in Theorems
6.3 and 6.4. For them, we have, respectively,

(6.20)  $M(y) = a_0 + a_1 y + a_2 y^2$,  $T(z) = b_0 + b_1 z + b_2 z^2$,
        $a_1 = b_1$,  $a_2 + b_2 = 0$;

(6.21)  $M(y) = a_0 + a_1 y + a_2 y^2 + a_3 y^3$,  $T(z) = b_0 + b_1 z + b_2 z^2 + b_3 z^3$,
        $a_1 = b_1$,  $a_2 + b_2 = 0$,  $a_3 = b_3 \neq 0$.


Now for case (1), the scalar curvatures of the fundamental tensors $\bar g_{ij}$,
$\bar g_{pq}$, $y^{-2}\bar g_{ij}$, $z^{-2}\bar g_{pq}$ are, respectively $\bar a = 2a_2$, $\bar b = 2b_2$, $2a_0$, $2b_0$, as follows
from (6.13) and Theorem 2.4. Hence Theorem 6.3 is proved by (6.18) and the
equations obtained by using (6.20) in (6.12) and (6.13).

Finally, to prove Theorem 6.4, we use (6.21), (6.13) in (6.12) and get

(6.22)  $y_{.ij} = -(a_1 + 2a_2 y + 3a_3 y^2)\,\bar g_{ij}$,  $z_{/pq} = -(a_1 - 2a_2 z + 3a_3 z^2)\,\bar g_{pq}$.

Since $\bar a = 2(a_2 + 3a_3 y)$, $\bar b = 2(-a_2 + 3a_3 z)$ by (6.13), and $a_3 \neq 0$, equations (6.7)
and (6.22) can be expressed in terms of $\bar a$, $\bar b$. The result is readily found to be

(6.23)  $\tau = A(\bar a + \bar b),$




(6.24)  $\bar a_{.ij} = -(1/2)(\bar a^2 + B)\,\bar g_{ij}$,  $\bar b_{/pq} = -(1/2)(\bar b^2 + B)\,\bar g_{pq}$,


where A =1/(6as), B=4(a3—3a,a3). This completes the proof of Theorems 
6.2-6.4. 
IV. IMPROPERLY CONFORMALLY SEPARABLE TENSORS 


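7.1. A preliminary theorem. The improperly conformally separable tensor
with $\rho = 1$ is

(7.1)  ${}^{*}g_{\alpha\beta} = \begin{pmatrix} g_{ij} & 0 \\ 0 & \sigma^{-2}g_{pq} \end{pmatrix}$,  $\alpha, \beta, \gamma, \delta = 1, \dots, m$,

where

$\sigma = \sigma(x^\alpha)$,  $g_{ij} = g_{ij}(x^k)$,  $g_{pq} = g_{pq}(x^r)$.


We suppose throughout this section that $\partial_i\sigma \neq 0$, that is, that ${}^{*}g_{\alpha\beta}$ is not
separable in the ordinary sense. For the tensor (7.1) we have (cf. (3.6))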


${}^{*}R_{ip} = -(m-n-1)\,\partial_p\sigma_i,$

${}^{*}R_{ij} = R_{ij} + (m-n)\,\sigma\left(\dfrac{1}{\sigma}\right)_{,ij},$


(7.2) ; 


where the comma denotes covariant differentiation with respect to $g_{ij}$. Let
(7.1) represent an $E_m(c)$, so that


(7.3)  ${}^{*}R_{\alpha\beta} = -(m-1)c\,{}^{*}g_{\alpha\beta}$,  c = const.


On account of this, (7.2)$_1$ becomes

$(m-n-1)\,\partial_p\sigma_i = 0.$

Hence


THEOREM 7.1. If the improperly conformally separable tensor (7.1) represents
an $E_m$, then either m = n+1 or $\sigma$ is of the form $z(x^p)/y(x^i)$.
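
The second alternative of Theorem 7.1 comes from integrating a vanishing mixed derivative. A brief sketch, assuming $\sigma_i = \partial_i\log\sigma$ as in (3.3) and $m \neq n+1$; the splitting of the integration functions into $\log z$ and $-\log y$ is a naming convention.

```latex
% Sketch only. Assumes \sigma_i = \partial_i\log\sigma as in (3.3) and m \neq n+1.
\begin{align*}
(m-n-1)\,\partial_p\sigma_i = 0
  &\Longrightarrow \partial_p\partial_i\log\sigma = 0 \\
  &\Longrightarrow \log\sigma = \log z(x^p) - \log y(x^i) \\
  &\Longrightarrow \sigma = \frac{z(x^p)}{y(x^i)} .
\end{align*}
```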


We discuss these two cases separately. 
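7.2. The case m = n+1. Since $\pm g_{mm}$ may be absorbed in $\sigma^{-2}$, (7.1) may
be written

(7.4)  ${}^{*}g_{\alpha\beta} = \begin{pmatrix} g_{ij} & 0 \\ 0 & e\sigma^{-2} \end{pmatrix}$,  $e = \pm 1$.


For this case ${}'R_{mm} = 0$, and, in consequence of (7.2), equation (7.3) is equivalent
to

(7.5)  $R_{ij} + \sigma\left(\dfrac{1}{\sigma}\right)_{,ij} = -nc\,g_{ij}$,  $\sigma\,g^{ij}\left(\dfrac{1}{\sigma}\right)_{,ij} = -nc.$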


If n = 2, we have shown (Theorem 4.1) that $g_{ij}$ represents an $S_2$. We now
consider the more general case when n>1 and the component tensor $g_{ij}$ is
such that

(7.6)  $R_{ij} = -(n-1)a\,g_{ij}$,  a = const.,

that is, that $g_{ij}$ represents an $E_n$ or an $S_2$. Because of (7.6), equation (7.5)$_1$
becomes

(7.7)  $\sigma\left(\dfrac{1}{\sigma}\right)_{,ij} = [(n-1)a - nc]\,g_{ij}.$

Transvecting this by $g^{ij}$ and comparing the result with (7.5)$_2$, we find that
(7.8)  a = c.


On account of this, equation (7.7) becomes

(7.9)  $\left(\dfrac{1}{\sigma}\right)_{,ij} = -\dfrac{a}{\sigma}\,g_{ij}.$

Since (7.5)$_2$ is evidently a consequence of (7.8) and (7.9), the latter equations,
because of (7.6), are equivalent to (7.5), and hence to (7.3). Thus we have
proved the following


THEOREM 7.2 (i). Let gi; be the fundamental tensor of an E,(a) or S:2(a). 
Then in order that the improperly conformally separable tensor (7.4) of the type 
(n>1, m—n=1) may represent an En, it is necessary and sufficient that gi; 
and o satisfy equation (7.9). 


Since by supposition $g_{ij}$ represents an $E_n(a)$ or $S_2(a)$, it follows from (7.9)
that the tensor $\sigma^2 g_{ij}$ represents $E_n$'s or $S_2$'s (Theorem 2.1). If, in particular,
$a = 0$, then (7.9) reduces to $(1/\sigma)_{,ij} = 0$, and therefore the $E_n(0)$ or $S_2(0)$ with
fundamental tensor $g_{ij}$ has a parallel vector field. Hence


THEOREM 7.2 (ii). If the improperly conformally separable tensor (7.4) and
its first component tensor $g_{ij}$ represent an $E_m(c)$ and an $E_n(a)$ or $S_2(a)$, respectively,
then the tensor $\sigma^2 g_{ij}$ represents $E_n$'s or $S_2$'s and $c = a$. If $c = a = 0$, then
the $E_n(0)$ or $S_2(0)$ with fundamental tensor $g_{ij}$ possesses a parallel vector field.


Conversely, let $g_{ij}$ be the fundamental tensor of an $E_n(a)$ which is conformal (!”)
to another $E_n$; then the equation


(") An $E_n$ with fundamental tensor $g_{ij}$ is said to be conformal to another $E_n$ if a
non-constant scalar $y$ exists such that $y^{-2}g_{ij}$ is the fundamental tensor of an $E_n$.




$y_{,ij} = -(ay + f)\,g_{ij}, \qquad f = \text{const.},$


has a solution for $y$ (cf. Theorems 2.1, 2.3 and (2.11)). Thus, if $a \neq 0$, the function
$1/\sigma = y + f/a$ evidently satisfies (7.9). Hence we have the following converse
to Theorem 7.2 (ii):


THEOREM 7.2 (iii). Given an $E_n$ whose scalar curvature is zero and which has
a parallel vector field, or one whose scalar curvature is not zero and which is conformal
to another $E_n$, then the given $E_n$ can be isometrically imbedded in an
$E_{n+1}$ as a member of $\infty^1$ isometric, non-parallel, and totally geodesic hypersurfaces.


The present case has already been considered with a different method by
Fialkow [7, 7'], and the results stated in Theorems 7.2 (i), (ii) and (iii) are
due to him. However(“*), he overlooked the exceptional case $E_n(0)$, for which
the property of its being conformal to another $E_n$ is not a sufficient condition
for it to be imbeddable in an $E_{n+1}$ in the manner stated in Theorem 7.2 (iii).

7.3. The case $m - n > 1$. By Theorem 7.1, in this case $\sigma$ must be of the
form $z(x^p)/y(x^i)$. Since $z(x^p)$ may be absorbed in $g_{pq}$, there is no loss of generality
in assuming that $\sigma = 1/y(x^i)$. Thus the conformally separable tensor
under consideration takes the form

(7.10) ${}^*g_{\alpha\beta} = \begin{pmatrix} g_{ij} & 0 \\ 0 & y^2 g_{pq} \end{pmatrix},$

where $y \neq$ const. On account of (7.3) and the fact that the fundamental tensors
$g_{pq}$ and ${}^*g_{pq} = y^2 g_{pq}$ have identical Ricci tensors, equation $(7.2)_3$ for the tensor
(7.10) can be written


(7.11) Rog = Roe = {im ~ De + (m — n — 1) 


(*) Equations (3.11) of Fialkow [7] represent a necessary and sufficient condition for an
$E_n$ to be imbeddable in an $E_{n+1}$ as a member of $\infty^1$ isometric, non-parallel, totally geodesic
hypersurfaces. On p. 427 (line 18) of the same paper, we find the sentence "According to Brinkmann
(3.11) is the necessary and sufficient condition that $E_n$ be conformal to another Einstein space
by means of a transformation $d\bar{s} = \sigma\,ds$ with $\Delta_1\sigma \neq 0$, where $\Delta_1\sigma = f^{ij}\sigma_{,i}\sigma_{,j}$." This sentence is not
entirely correct; in fact, the condition is sufficient but not necessary. To explain, we use Fialkow's
notation. Brinkmann's original necessary and sufficient condition referred to above is
[3, p. 125, Theorem II] that a coordinate system exists in which the fundamental tensor of $E_n$
is of the form

fan = + + 
(4) Sau = + 2Ax* +4)F,(%), far =9, 
A and d constant, 

and the form Fy (x’) dx*dx* is the fundamental form of an E,-:. Equations (3.11) of Fialkow 
[7] differ from (4) by the absence of the constant A, and, (A) can be reduced, by putting 
= x*+B, B =const., to (3.11) of Fialkow [7] when c¥0 but not when c=0. This justifies our 
statement. 




Hence 


THEOREM 7.3. If the improperly conformally separable tensor (7.10) of the
type ($n$, $m - n > 2$) represents an $E_m$, then its second component tensor ${}^*g_{pq} = y^2 g_{pq}$
represents $E_{m-n}$'s.


We shall now consider separately the following three subcases: (1) $n = 1$;
(2) $n > 1$ and $g_{ij}$ represents an $E_n$ or $S_2$; (3) $n = 2$.

Subcase 1. $n = 1$, $m - n > 1$.

The conformally separable tensor in question is

(7.12) ${}^*g_{\alpha\beta} = \begin{pmatrix} g_{11} & 0 \\ 0 & y^2 g_{pq} \end{pmatrix}, \qquad y = y(x^1) \neq \text{const.}$

With $n$, $x^1$, $g_{11}$, $y^2 g_{pq}$ in place of $m - n$, $x^m$, $e\sigma^{-2}$, $\rho^{-2}g_{ij}$, respectively, the case
considered in §4 reduces to the present subcase. Hence from Theorems 4.1
and 4.2, we have


THEOREM 7.4. In order that the improperly conformally separable tensor
(7.12) with $m > 2$ may represent an $E_m(c)$, it is necessary and sufficient that $g_{pq}$
represent an $E_{m-1}(b)$ or $S_2(b)$ and

(7.13) $g_{11} = \dfrac{(y')^2}{b - cy^2}.$


This result can also be proved directly from (7.2) and (7.3).
Subcase 2. $n > 1$, $m - n > 1$, and $g_{ij}$ represents an $E_n$ or $S_2$.
By hypothesis, we have, besides (7.2), (7.3), (7.11), also

(7.14) $R_{ij} = -(n-1)a\,g_{ij}, \qquad a = \text{const.}$

In consequence of this and (7.3), equation $(7.2)_2$ becomes

(7.15) $y_{,ij} = -\dfrac{(m-1)c - (n-1)a}{m-n}\,y\,g_{ij}.$

This equation is of the form (2.1). And since (7.14) is satisfied, it follows from
(2.11) that (cf. Theorem 2.3)

$\dfrac{(m-1)c - (n-1)a}{m-n}\,y = ay + f, \qquad f = \text{const.},$

which give $f = 0$ and $a = c$. Therefore (7.15) is equivalent to
(7.16) $y_{,ij} = -ay\,g_{ij},$
(7.17) $c = a.$


As a consequence of (7.14) and (7.16), $\bar{g}_{ij} = y^{-2}g_{ij}$ is the fundamental tensor
of an $E_n(\bar{a})$ or $S_2(\bar{a})$, where $\bar{a}$ is determined from (cf. Theorem 2.4)

(7.18) $g^{ij}y_{,i}y_{,j} = -(ay^2 + \bar{a}).$

When (7.16)-(7.18) are used in (7.11), the latter becomes

(7.19) $R_{pq} = -(m-n-1)(-\bar{a})\,g_{pq},$

which shows that $g_{pq}$ represents an $E_{m-n}(-\bar{a})$ or $S_2(-\bar{a})$. Hence


THEOREM 7.5 (i). If the improperly conformally separable tensor (7.10) of
the type ($n > 1$, $m - n > 1$) and its component tensor $g_{ij}$ represent, respectively,
an $E_m(c)$ and an $E_n(a)$ or $S_2(a)$, then (1) $g_{pq}$ represents an $E_{m-n}(b)$ or $S_2(b)$,
(2) $y^{-2}g_{ij}$ represents an $E_n(\bar{a})$ or $S_2(\bar{a})$, and (3) $c = a$, $b = -\bar{a}$. If $c = a = 0$, the
$E_n(0)$ or $S_2(0)$ with fundamental tensor $g_{ij}$ possesses a parallel vector field.

We observe that when (7.14) is supposed, equations (7.2), (7.3) are equivalent
to (7.16)-(7.19). Then, by a consideration similar to that leading to
Theorem 7.2 (iii), we can prove the following

THEOREM 7.5 (ii). Given an $E_n$ whose scalar curvature is zero and which
possesses a parallel vector field or one whose scalar curvature is not zero and
which is conformal to another $E_n$, then the given $E_n$ can be isometrically imbedded
in an $E_m$ of any dimension $m$ greater than $n+1$ as a member of $\infty^{m-n}$
isometric and totally geodesic subspaces $E_n$'s which are orthogonal to $\infty^n$ totally
umbilical subspaces $E_{m-n}$'s or $S_2$'s.

Subcase 3. $n = 2$, $m - n > 1$.

For this case we have
(7.20) $R_{ij} = -a(x^k)\,g_{ij}.$

In consequence of this and (7.3), equation $(7.2)_2$ becomes

(7.21) $y_{,ij} = -\dfrac{(m-1)c - a(x^k)}{m-2}\,y\,g_{ij}.$

This is of the form (2.1), and consequently, it follows from (2.9) that (cf.
Theorem 2.3)

$\dfrac{d}{dy}\left[\dfrac{(m-1)c - a(y)}{m-2}\,y\right] = a(y),$

that is,
$a'(y)\,y + (m-1)a(y) = (m-1)c.$
Multiplying this by $y^{m-2}$ and then integrating, we have
(7.22) $a(y) = c - (m-2)f\,y^{1-m}, \qquad f = \text{const.}$


Here it is evident that $a(y)$ is constant or not according as $f$ is or is not zero.
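
The integration step leading to (7.22) can be checked mechanically. The following is a minimal sketch, not part of the paper: it assumes the exponent $1-m$ in (7.22), which is what multiplying by $y^{m-2}$ and integrating yields, and the function names are illustrative only. It evaluates the residual of $a'(y)y + (m-1)a(y) = (m-1)c$ in exact rational arithmetic.

```python
from fractions import Fraction


def a_of_y(y, m, c, f):
    """Candidate solution (7.22): a(y) = c - (m - 2) f y^(1 - m)."""
    return c - (m - 2) * f * y ** (1 - m)


def a_prime(y, m, c, f):
    """Derivative of a_of_y with respect to y, worked out by hand."""
    return -(m - 2) * (1 - m) * f * y ** (-m)


def ode_residual(y, m, c, f):
    """Left side minus right side of a'(y) y + (m - 1) a(y) = (m - 1) c."""
    return a_prime(y, m, c, f) * y + (m - 1) * a_of_y(y, m, c, f) - (m - 1) * c


# Arbitrary sample points (y, m, c, f); the residual should vanish at each.
samples = [
    (Fraction(3, 2), 4, Fraction(1), Fraction(2, 5)),
    (Fraction(7, 3), 5, Fraction(-2), Fraction(1, 4)),
    (Fraction(5), 6, Fraction(0), Fraction(3)),
]
residuals = [ode_residual(y, m, c, f) for (y, m, c, f) in samples]
```

With $f = 0$ the function `a_of_y` returns the constant $c$, in line with the remark that $a(y)$ is constant exactly when $f$ vanishes.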
With this value of $a(y)$, equation (7.21) becomes

(7.23) $y_{,ij} = -(cy + f\,y^{2-m})\,g_{ij}.$

From this and (2.10), which holds by Theorem 2.3, it follows that

(7.24) $g^{ij}y_{,i}y_{,j} = -\left(cy^2 + \dfrac{2f}{3-m}\,y^{3-m}\right) + b, \qquad b = \text{const.}$

When (7.23) and (7.24) are substituted in (7.11) for $n = 2$, the result is
(7.25) $R_{pq} = -(m-3)\,b\,g_{pq},$

which is the condition for $g_{pq}$ to represent an $E_{m-2}(b)$ or $S_2(b)$. Hence


THEOREM 7.6. In order that the improperly conformally separable tensor
(7.10) of the type ($n = 2$, $m - n > 1$) may represent an $E_m(c)$, it is necessary and
sufficient that (1) $g_{ij}$ and $y$ be such that equation (7.23) is satisfied with a constant
$f$, and (2) $g_{pq}$ represent an $E_{m-2}$ or $S_2$ of scalar curvature $b$ given by (7.24).


As a verification we observe that the result for subcase 2 with $n = 2$ is
identical with the result for subcase 3 with $a = $ const. (that is, $f = 0$).

Finally, it follows from the last but one paragraph of §2 that a 2-dimensional
fundamental tensor $g_{ij}$ actually exists whose scalar curvature is not
constant and for which equation (7.23) admits a solution for $y$. Thus there
exists an improperly conformally separable tensor which represents an
$E_m$ and whose component tensors $g_{ij}$ and ${}^*g_{pq}$ are such that the first represents
a $V_2$ which is not an $S_2$, while the second represents $E_{m-2}$'s or $S_2$'s. This fact
is in contrast with Theorem 6.2 of the preceding section and Theorem 8.1 of
the following section.

8. Separable tensors. In this section we reproduce some results of Fialkow
concerning a separable tensor, thus completing our discussion of the conformally
separable tensor which represents an $E_m$ and each of whose component
tensors either is of dimension less than 3 or represents $E$'s.

For the separable tensor

(8.1) ${}^*g_{\alpha\beta} = \begin{pmatrix} g_{ij}(x^k) & 0 \\ 0 & g_{pq}(x^r) \end{pmatrix}$

equations (3.6) reduce to
(8.2) ${}^*R_{ip} = 0, \qquad {}^*R_{ij} = R_{ij}, \qquad {}^*R_{pq} = R_{pq}.$
Suppose that (8.1) represents an $E_m(c)$, that is, that
(8.3) ${}^*R_{\alpha\beta} = -(m-1)c\,{}^*g_{\alpha\beta}, \qquad c = \text{const.}$
Then according as $n > 1$, $m - n > 1$, or $n > 1$, $m - n = 1$, (8.2) become
(8.4) $0 = 0, \qquad R_{ij} = -(m-1)c\,g_{ij}, \qquad R_{pq} = -(m-1)c\,g_{pq},$

or

(8.5) $0 = 0, \qquad R_{ij} = -(m-1)c\,g_{ij}, \qquad 0 = -(m-1)c\,g_{mm}.$
From these we have at once the following results due to Fialkow [7]:


THEOREM 8.1. A separable tensor of the type ($n > 1$, $m - n > 1$) represents an
$E_m(c)$, if and only if its component tensors $g_{ij}$ and $g_{pq}$ represent, respectively,
an $E_n(a)$ or $S_2(a)$ and an $E_{m-n}(b)$ or $S_2(b)$, and $(m-1)c = (n-1)a = (m-n-1)b$.


THEOREM 8.2. A separable tensor of the type ($n > 1$, $m - n = 1$) represents
an $E_m(c)$, if and only if its component tensor $g_{ij}$ represents an $E_n(0)$ or $S_2(0)$.
Then $c = 0$.
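
As a small numerical illustration of the relation $(m-1)c = (n-1)a = (m-n-1)b$ in Theorem 8.1, the sketch below solves it for $a$ and $b$ given $m$, $n$ and $c$. The helper name `component_curvatures` is hypothetical and introduced only for this example.

```python
from fractions import Fraction


def component_curvatures(m, n, c):
    """Return (a, b) with (m-1)c = (n-1)a = (m-n-1)b, as in Theorem 8.1."""
    if n <= 1 or m - n <= 1:
        raise ValueError("Theorem 8.1 assumes n > 1 and m - n > 1")
    common = (m - 1) * Fraction(c)
    return common / (n - 1), common / (m - n - 1)


# Type (2, 2) in dimension m = 4: both components get the same constant 3c.
a, b = component_curvatures(4, 2, 1)

# Type (3, 3) in dimension m = 6 with c = 1/5.
a2, b2 = component_curvatures(6, 3, Fraction(1, 5))
```

For the type (2, 2) the two constants coincide, which matches the statement in §10 that the two component tensors of a separable tensor of this type each represent an $S_2$ of equal scalar curvature.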


V. PARTICULAR CASES 


9. $S_m$ with conformally separable fundamental tensor. It is well known
that the fundamental form of an $S_m(c)$ can always be reduced to the Riemann
form [5, p. 85]

(9.1) $\dfrac{e_1(dx^1)^2 + \cdots + e_m(dx^m)^2}{\{1 + (c/4)[e_1(x^1)^2 + \cdots + e_m(x^m)^2]\}^2},$

where each $e$ is $\pm 1$. The form (9.1) is evidently properly conformally separable(!*)
of the type ($n$, $m - n$), where $n$ may be any integer from 1 to $m - 1$.
In what follows we give a few theorems concerning a conformally separable
tensor which represents an $S_m$. Throughout this section all the symbols and
indices have the same usual meaning.

From (3.4) it follows that the condition


(9.2) *Risy = (5p *gay — 


for the conformally separable tensor (1.1) to represent an S,, becomes 


"Rin = (¢ + — 
"Roe = (59 "8 pr — 54); 
(n 1) (Ap, = 0, (m 1)(0,0; => 0, 


9.3 1 1 
Je 
n 
1 1 1 
m—n 


For $\rho = \sigma = 1$, we have immediately [9, p. 896]:


THEOREM 9.1. A separable tensor cannot represent an $S_m$ of nonzero scalar
curvature.


() The fundamental form ${}^*g_{\alpha\beta}\,dx^\alpha dx^\beta$ is said to be conformally separable, if the tensor
${}^*g_{\alpha\beta}$ is conformally separable, and vice versa.


From (9.3); and (9.3): we have the following theorem [14]: 


THEOREM 9.2. If a conformally separable tensor represents an Sm, then each 
of its component tensors, if it is of dimension greater than 2, represents S's. 


Since an S is necessarily an E, it follows from this theorem that all prop- 
erties of the conformally separable tensor which represents an E,, and each of 
whose component tensors either is of dimension less than 3 or represents E’s 
must also be shared by the conformally separable tensor which represents an 
Sm. This observation together with what we have obtained in §§4—7 enables 
us to prove the following extension of Theorem 9.2: 


THEOREM 9.3. If a conformally separable tensor ${}^*g_{\alpha\beta}$ which is not an
ordinarily separable tensor represents an $S_m$, then each of its component tensors
either is of dimension 1 or represents $S$'s. Conversely, if ${}^*g_{\alpha\beta}$ represents an $E_m$
and each of its component tensors either is of dimension 1 or represents $S$'s, then
the $E_m$ is an $S_m$.


Proof. The proof of this theorem has to be carried out separately for
several cases. Consider first the case of the properly conformally separable
tensor, which we subdivide into the following three types: (1) $n = 2$, $m - n = 1$,
(2) $n > 2$, $m - n = 1$, (3) $n > 1$, $m - n > 1$.

For the type ($n = 2$, $m - n = 1$), Theorem 9.3 reduces to Theorem 4.1 for
$n = 2$, because an $E_3$ is an $S_3$.

For the type (n >2, m—n=1), equations (9.3) become 


m 
"Rise = (C + p pm)(5;*gix — 5x 
+ pmo; = 0, 


1 a 1 
n os 
1 
* 
5mm 


The first part of the theorem follows at once from (9.4);. To prove the con- 
verse part, we suppose that *g,s, and its component tensor *g;;=p~*g;; repre- 
sent, respectively, an E,, and S,’s, and show that (9.4) are satisfied. From this 
supposition it follows that all the equations appearing in §4 (with the excep- 
tion of (4.6’)) and the equation 


(9.5) "Rise = *a(x")(8;* gin — 84% 


are true. Now equations (9.4). and (9.4); are satisfied because of (4.2)1, (4.2) 
and (4.3). And on account of (9.5), equation (9.4), becomes 


1 » 
(9.4) = 
P/imm 
(~) | 
= 


= *a(x™) — *g""(om)?, 


which, by Theorem 4.1, is identical with (4.6). Finally, by virtue of (4.15), 
equation (9.4), becomes (4.14’). Thus equations (9.4) are satisfied, as was to 


be proved. 

We can now come to the properly conformally separable tensor of the type 
(n>1,m—n>1). As a consequence of Theorems 9.2 and 5.2, a properly con- 
formally separable tensor of this type which represents an S,, must be of the 


form 
0 
(9.6) = ‘aa |. 
O 
The Riemann tensors *Rég,; and Rig, of the tensors *g.s and 
0 
(9.7) 
O 
are connected by (cf. (2.5)) 


€ € e «e -1 
*Rasy = "Rasy + (Sgr: ay 647: ap)T + (7; Bay = Bas)T 


(9.8) ar 
where the colon denotes covariant differentiation with respect to ’’gas; the 
indices a, B, y, €, x, X have the range 1,---, m; and the components 


of Rig, are (cf. (9.7), (3.4)) 


i | s 
"Rin = Rijn "Roe = 


(9.9) 
"Res, = 0 if a, 8, y, € are not all in the same range. 


From Theorem 9.2 and the remark below it and also the equations in §6, it 
follows that if the tensor *g,s defined by (9.6) represents an S,,(c), then we 
have 
7 = y(x*) + 2(x’), 

M = a + ay + + asy’, T = bo + — + 

ig = — = — T’ 
(9.10) 2/pq pe 


Rise = — 81853), Roar = (88 — 0)» 

where $a_3 = 0$ unless $m = 2n = 4$. It can now be readily verified that in consequence
of (9.8)-(9.10), equation (9.2) becomes $a_3 = 0$. On account of this, it
follows easily from (9.10) that if any of the component tensors $\tau^{-2}g_{ij}$ and
$\tau^{-2}g_{pq}$ is of dimension 2, then it represents $S_2$'s. The first part of Theorem 9.3
for the present case is thus proved. We now suppose that ${}^*g_{\alpha\beta}$ represents an
$E_m$ and its component tensors $S$'s. Then (9.10) are satisfied with $a_3 = 0$, and
consequently, equation (9.2) is satisfied, as follows from the sentence below
(9.10). Hence the $E_m$ in question is an $S_m$ and the proof of our theorem for
the case of a properly conformally separable tensor is completed.

Finally, for an improperly conformally separable tensor with $\rho = 1$, $\partial_i\sigma \neq 0$,
equations (9.3) become


= c(jgix — 5xgis), 


"Roar = (¢ + — 8: 
(m 1)d,0; 0, 


= — 
i 


If we compare these equations with those appearing in §7, we shall see easily 
that Theorem 9.3 is true. Theorem 9.3 has thus been completely proved. 

10. $E_4$ with conformally separable fundamental tensor of the type (2, 2).
Let us review the cases we have considered for a conformally separable tensor
which represents an $E_4$.

The type (3, 1). We have considered only the particular case when the
first component tensor $g_{ij}$ represents $E_3$'s, that is, $S_3$'s. By Theorem 9.3, an
$E_4$ with such a conformally separable fundamental tensor is an $S_4$.

The type (2, 2). If ${}^*g_{\alpha\beta}$ is a properly conformally separable tensor, then
either its component tensors both represent $S_2$'s or neither of them does
(cf. Theorem 6.2); for the respective cases, the $E_4$ is an $S_4$ or not an $S_4$ (cf.
Theorem 9.3). If ${}^*g_{\alpha\beta}$ is an improperly conformally separable tensor with
$\rho = 1$, $\partial_i\sigma \neq 0$, the component tensor ${}^*g_{pq}$ necessarily represents $S_2$'s, while
the other, $g_{ij}$, may or may not (cf. Theorem 7.6); for the respective cases,
the $E_4$ is an $S_4$ or not an $S_4$. Finally, if ${}^*g_{\alpha\beta}$ is a separable tensor, its two
component tensors each represent an $S_2$ of equal nonzero scalar curvature
(cf. Theorems 8.1 and 9.1).

Hence we have three and only three cases in which the $E_4$ is not an $S_4$;
they are the cases of Theorem 6.4, Theorem 7.6 for $m = 4$ and $f \neq 0$, and
Theorem 8.1 for $m = 2n = 4$ and $a \neq 0$. From these theorems and (2.14), (9.1)
we have at once the following


THEOREM 10.1. A conformally separable fundamental form of the type (2, 2)
represents an $E_4$ which is not an $S_4$, if and only if it can be reduced to one of the
following forms (in which each $e$ is $\pm 1$; $A$, $B$, $C$, $D$ are constants, and $A \neq 0$):


(dx')? 
(1/3)(x1)* + Bx! +C 
(dx*)? 


(1/3)(x*)? + Ba? + D + e[(1/3)(x*)* + + 


+ e2[(1/3)(x!)? + Bx! + C](dx*)? 


A(x! x3)-? { 


(1) 


(9.11) 




(dx')* 


(2) (x)? [es(dx*)? + e(dx*)?] 


[1 + (B/4) (eal 28)? + 
e1(dx')? + e(dx*)? es(dx*)? + e(dx*)? 
[1 + (4/4) (ex(x")* + e2(x*)*) [1 + (4/4)(es( 24)? + 
Form (3) has been obtained by Kasner [11], and a form which is essen- 


tially the same as (2) for B=1 by Kottler [12, p. 443]. Form (1) however 
seems to be introduced here for the first time. 


(3) 


REFERENCES 


1. E. Bompiani, Spazi Riemanniani luoghi di totalmente geodetiche, Rend. Circ. Mat.
Palermo vol. 48 (1924) pp. 121-134.

2. H. W. Brinkmann, Riemann spaces conformal to Einstein spaces, Math. Ann. vol. 91
(1924) pp. 269-278.

3. H. W. Brinkmann, Einstein spaces which are mapped conformally on each other, ibid. vol. 94
(1925) pp. 119-145.

4. A. Delgleize, Sur les équations de Weingarten et les espaces pseudosphériques, Bulletin de la
Société royale des Sciences de Liége vol. 4 (1935) pp. 158-161.

5. L. P. Eisenhart, Riemannian geometry, 1926, Princeton.

6. A. Fialkow, Einstein spaces in a space of constant curvature, Proc. Nat. Acad. Sci. U.S.A.
vol. 24 (1938) pp. 30-34.

7. A. Fialkow, Totally geodesic Einstein spaces, Bull. Amer. Math. Soc. vol. 45 (1939)
pp. 423-428.

7'. A. Fialkow, Correction to "Totally geodesic Einstein spaces," ibid. vol. 48 (1942) pp. 167-168.

8. A. Fialkow, Conformal geodesics, Trans. Amer. Math. Soc. vol. 45 (1939) pp. 443-473.

9. F. A. Ficken, The Riemannian and affine differential geometry of product-spaces, Ann. of
Math. (2) vol. 40 (1939) pp. 892-913.

10. E. Kasner, An algebraic solution of the Einstein equations, Trans. Amer. Math. Soc. vol.
27 (1925) pp. 101-105.

11. E. Kasner, Separable quadratic differential forms and Einstein solutions, Proc. Nat. Acad.
Sci. U.S.A. vol. 11 (1925) pp. 95-96.

12. F. Kottler, Über die physikalischen Grundlagen der Einsteinschen Gravitationstheorie,
Annalen der Physik (4) vol. 56 (1918) pp. 401-462.

13. J. A. Schouten and D. J. Struik, On some properties of general manifolds relating to
Einstein's theory of gravitation, Amer. J. Math. vol. 43 (1921) pp. 217-221.

14. K. Yano, Conformally separable quadratic differential forms, Proc. Imp. Acad. Tokyo
vol. 16 (1940) pp. 83-86.

15. K. Yano, Concircular geometry, I. Concircular transformations, loc. cit. pp. 195-200.

16. K. Yano, Concircular geometry, II. Integrability conditions of $\rho_{\mu\nu} = \phi g_{\mu\nu}$, loc. cit.
pp. 354-360.

17. Yung-Chow Wong, Family of totally umbilical hypersurfaces in an Einstein space, Ann.
of Math. (to appear).


CAMBRIDGE, Mass. 


194 


ON THE DIRECT PRODUCT OF BANACH SPACES 


BY 
ROBERT SCHATTEN 


Introduction. Two Banach spaces $E_1$ and $E_2$ may be combined in two
different ways: the well known $E_1 \oplus E_2$ and $E_1 \otimes E_2$. While $E_1 \oplus E_2$ refers to a
space of pairs $\{f, \varphi\}$ which are added vectorially, $E_1 \otimes E_2$ is a linear vector
space determined by "products" $f \otimes \varphi$, and the $f \otimes \varphi$'s are linearly independent
except when we have a relation which is a consequence of the fact that the $\otimes$
operator is distributive; for instance

$(f_1 + f_2) \otimes (\varphi_1 + \varphi_2) = f_1 \otimes \varphi_1 + f_1 \otimes \varphi_2 + f_2 \otimes \varphi_1 + f_2 \otimes \varphi_2.$


The notion of $E_1 \otimes E_2$ for the case of finite dimensions has been already
mentioned by H. Weyl [16](1). With each vector $e_1$ resp. $e_2$ in a space $E_1$ of $m$,
resp. $E_2$ of $n$ dimensions, there is associated a vector $e_1 \otimes e_2$ in the space of $m \cdot n$
dimensions. The totality of vectors $e_1 \otimes e_2$ do not themselves constitute a
linear manifold, but their linear combinations fill the entire "product space"
$E_1 \otimes E_2$.
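
The finite-dimensional picture can be made concrete with coordinates. The following is a minimal sketch, not taken from the paper: it assumes that $f \otimes \varphi$ is represented by the $m \times n$ array of products of coordinates, and the helper names are illustrative. It checks the distributive relation on one example and shows a sum of two products that is not itself a single product.

```python
def tensor(f, phi):
    """Coordinates of f (x) phi: the array with entries f[i] * phi[j]."""
    return [[a * b for b in phi] for a in f]


def vec_add(u, v):
    """Componentwise sum of two vectors of equal length."""
    return [a + b for a, b in zip(u, v)]


def mat_add(a, b):
    """Entrywise sum of two arrays of equal shape."""
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]


f1, f2 = [1, 2], [0, 3]
p1, p2 = [1, -1, 4], [2, 0, 5]

# Distributivity: (f1 + f2) (x) (p1 + p2) equals the sum of the four products.
lhs = tensor(vec_add(f1, f2), vec_add(p1, p2))
rhs = mat_add(mat_add(tensor(f1, p1), tensor(f1, p2)),
              mat_add(tensor(f2, p1), tensor(f2, p2)))
distributive = lhs == rhs

# e1 (x) e1 + e2 (x) e2 has nonzero 2x2 determinant, whereas every single
# product f (x) phi has determinant zero, so this sum is not a product.
e1, e2 = [1, 0], [0, 1]
s = mat_add(tensor(e1, e1), tensor(e2, e2))
det_s = s[0][0] * s[1][1] - s[0][1] * s[1][0]
```

The second check illustrates why the products alone do not form a linear manifold while their linear combinations do.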

The operator $\otimes$ has been used for finite-dimensional $l_p$ (termed $l_{p,n}$) by
F. J. Murray, in order to show that there exist linear manifolds without complements
[8]. It has also been used by the same author in treating bilinear
transformations in Hilbert spaces [9].

The algebraic aspects of the $\otimes$ operator, for the case of finite-dimensional
spaces, have been discussed by Hitchcock [5, 6] and Oldenburger [13, 14].

The study of the $\otimes$ operator for infinite-dimensional spaces requires a
more abstract method. A complete discussion for Hilbert spaces has been
given by F. J. Murray and J. von Neumann [10], who did not make use of
the existence of a basis. A few special results for $L_p$ spaces have been obtained
by Bourgin [2].

It should be pointed out, however, that so far the study of the $\otimes$ operator
has assumed either the existence of an inner product, or a basis, or a projection
with bound 1.

The object of this paper is the study of the $\otimes$ operator in a most general
form, for any Banach spaces.

For $f \in E_1$, $\varphi \in E_2$, we construct "products" $f \otimes \varphi$. With these we form a
linear set $\mathfrak{A}(E_1, E_2)$ consisting of all "expressions" (that is, finite sums)
$\sum_i f_i \otimes \varphi_i$. These expressions must first be considered algebraically, however,
since the distributive property introduces certain linear dependencies (§§1, 2).
The next problem is that of defining a norm, and the space $E_1 \otimes E_2$ is obtained
by "closing up" the set $\mathfrak{A}(E_1, E_2)$. We are only interested in those norms
which are "crossnorms," that is, $\|f \otimes \varphi\| = \|f\|\,\|\varphi\|$ for $f \in E_1$, $\varphi \in E_2$. With the
expressions $\sum_i f_i \otimes \varphi_i$ it is desirable to consider the set $\mathfrak{A}(\bar{E}_1, \bar{E}_2)$ of expressions

$\sum_i F_i \otimes \Phi_i, \qquad \text{where } F_i \in \bar{E}_1,\ \Phi_i \in \bar{E}_2.$


Presented to the Society, October 25, 1941; received by the editors January 26, 1942.
(1) Numerals in brackets refer to the bibliography at the end of this paper.


For a given norm in $\mathfrak{A}(E_1, E_2)$ we construct an associate norm in $\mathfrak{A}(\bar{E}_1, \bar{E}_2)$
(§3). We prove the existence of the greatest crossnorm, the least crossnorm
whose associate is also a crossnorm (§4), as well as that of a "self-associate"
crossnorm, which is a generalization of the crossnorm given for Hilbert spaces
by F. J. Murray and J. von Neumann [10] (§5).

Finally, we mention a few unsolved problems in connection with the work
in preceding sections (§6), and indicate possibilities for the construction of
"general crossnorms" (§7).

These problems were suggested to me by Professor F. J. Murray, who also 
pointed out the algebraic discussions given in §§1, 2. 

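1. Let $E_1$ and $E_2$ denote two linear spaces. We introduce two symbols $\otimes$
and $\dotplus$. With these for $f_1, \dots, f_n$ in $E_1$ and $\varphi_1, \dots, \varphi_n$ in $E_2$ we construct
formal expressions $f_1 \otimes \varphi_1 \dotplus f_2 \otimes \varphi_2 \dotplus \cdots \dotplus f_n \otimes \varphi_n$. We may abbreviate
this by writing $\sum_{i=1}^{n} f_i \otimes \varphi_i$. Between these we introduce a relation $\simeq$ subject
to the following rules:

1. If $P$ is a permutation on $1, 2, \dots, n$, and $P(i)$ denotes the integer into
which $P$ takes $i$, then

$\sum_{i=1}^{n} f_i \otimes \varphi_i \simeq \sum_{i=1}^{n} f_{P(i)} \otimes \varphi_{P(i)}.$

2a. $(f_1' + f_1'') \otimes \varphi_1 \dotplus f_2 \otimes \varphi_2 \dotplus \cdots \dotplus f_n \otimes \varphi_n \simeq f_1' \otimes \varphi_1 \dotplus f_1'' \otimes \varphi_1 \dotplus f_2 \otimes \varphi_2 \dotplus \cdots \dotplus f_n \otimes \varphi_n.$

2b. $f_1 \otimes (\varphi_1' + \varphi_1'') \dotplus f_2 \otimes \varphi_2 \dotplus \cdots \dotplus f_n \otimes \varphi_n \simeq f_1 \otimes \varphi_1' \dotplus f_1 \otimes \varphi_1'' \dotplus f_2 \otimes \varphi_2 \dotplus \cdots \dotplus f_n \otimes \varphi_n.$

3. For any numbers $a_1, \dots, a_n$, $\sum_{i=1}^{n} (a_i f_i) \otimes \varphi_i \simeq \sum_{i=1}^{n} f_i \otimes (a_i \varphi_i).$


DEFINITION 1.1. Two expressions $\sum_{i=1}^{n} f_i \otimes \varphi_i$ and $\sum_{j=1}^{m} g_j \otimes \psi_j$ will be termed
equivalent, if one can be transformed into the other by a finite number of successive
applications of Rules 1, 2, 3. We write this $\sum_{i=1}^{n} f_i \otimes \varphi_i \simeq \sum_{j=1}^{m} g_j \otimes \psi_j$.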


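Rule 1 shows that $\simeq$ is reflexive, that is, every expression is equivalent
to itself. The definition also implies transitivity.

A number of elementary results can be easily obtained. For instance, if

$\sum_{i=1}^{n} f_i \otimes \varphi_i \simeq \sum_{i=1}^{n'} f_i' \otimes \varphi_i' \quad\text{and}\quad \sum_{j=1}^{m} g_j \otimes \psi_j \simeq \sum_{j=1}^{m'} g_j' \otimes \psi_j',$

then

$\sum_{i=1}^{n} f_i \otimes \varphi_i \dotplus \sum_{j=1}^{m} g_j \otimes \psi_j \simeq \sum_{i=1}^{n'} f_i' \otimes \varphi_i' \dotplus \sum_{j=1}^{m'} g_j' \otimes \psi_j'.$

Because of Rules 2a, 2b, $n$ is not an invariant under equivalence for the
expression $\sum_{i=1}^{n} f_i \otimes \varphi_i$. However, we now define a quantity, "the rank of
$\sum_{i=1}^{n} f_i \otimes \varphi_i$," which will be shown to be an invariant.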


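DEFINITION 1.2. Consider $\sum_{i=1}^{n} f_i \otimes \varphi_i$. Let us suppose that the set of $f_i$'s is
$k$-dimensional and that the set of $\varphi_i$'s is $l$-dimensional. Let $g_1, \dots, g_k$ be $k$
linearly independent elements in the set of linear combinations of the $f_i$, and
$\psi_1, \dots, \psi_l$ be $l$ linearly independent elements in the set of linear combinations
of the $\varphi_i$. Then

$f_i = \sum_{p=1}^{k} a_p^{(i)} g_p, \qquad \varphi_i = \sum_{q=1}^{l} b_q^{(i)} \psi_q, \qquad 1 \le i \le n,$

and

$\sum_{i=1}^{n} f_i \otimes \varphi_i \simeq \sum_{i=1}^{n} \Big(\sum_{p=1}^{k} a_p^{(i)} g_p\Big) \otimes \Big(\sum_{q=1}^{l} b_q^{(i)} \psi_q\Big) \simeq \sum_{p=1}^{k}\sum_{q=1}^{l} \Big(\sum_{i=1}^{n} a_p^{(i)} b_q^{(i)}\Big) g_p \otimes \psi_q = \sum_{p=1}^{k}\sum_{q=1}^{l} a_{p,q}\, g_p \otimes \psi_q.$

We define the "rank of the expression $\sum_{i=1}^{n} f_i \otimes \varphi_i$" as the rank of the matrix
$(a_{p,q})$, $p = 1, \dots, k$; $q = 1, \dots, l$.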

To justify this definition we show that the rank does not depend upon
the choice of the $g_p$ or $\psi_q$. For suppose we had taken $g_1', \dots, g_k'$ and
$\psi_1', \dots, \psi_l'$ instead of $g_1, \dots, g_k$ and $\psi_1, \dots, \psi_l$ above. Then we can show
that

$\sum_{i=1}^{n} f_i \otimes \varphi_i \simeq \sum_{p=1}^{k}\sum_{q=1}^{l} a_{p,q}'\, g_p' \otimes \psi_q',$

where $(a_{p,q}')$ is equal to a product $B(a_{p,q})C$; $B$ resp. $C$ being nonsingular square
matrices with $k$ resp. $l$ rows. But the matrix $B(a_{p,q})C$ is of the same rank as
$(a_{p,q})$. Thus the rank of an expression is independent of the choice of the $g_p$
and $\psi_q$.

Let us define $f_q^*$ by means of the equation $f_q^* = \sum_{p=1}^{k} a_{p,q} g_p$. Then, inasmuch
as the $\psi_q$ are linearly independent, the rank of the expression $\sum_{i=1}^{n} f_i \otimes \varphi_i$ is the
number of $f_q^*$ which are linearly independent.
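
In finite dimensions the rank of an expression can be computed directly. The sketch below is an illustration under stated assumptions, not the paper's construction: vectors are given by rational coordinates, and the standard bases are used in place of the $g_p$ and $\psi_q$ (this only pads the coefficient matrix with dependent rows and columns, so the rank is unchanged). The function names are hypothetical.

```python
from fractions import Fraction


def coefficient_matrix(expr, dim1, dim2):
    """Sum of the outer products f_i phi_i^T for expr = [(f_1, phi_1), ...]."""
    total = [[Fraction(0)] * dim2 for _ in range(dim1)]
    for f, phi in expr:
        for p in range(dim1):
            for q in range(dim2):
                total[p][q] += Fraction(f[p]) * Fraction(phi[q])
    return total


def matrix_rank(rows):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def expression_rank(expr, dim1, dim2):
    """Rank of the expression sum_i f_i (x) phi_i in coordinates."""
    return matrix_rank(coefficient_matrix(expr, dim1, dim2))


# One term, and the same element split by Rule 2a into two terms.
one_term = [([1, 1], [1, 0])]
split = [([1, 0], [1, 0]), ([0, 1], [1, 0])]
# Two terms with independent f's and independent phi's.
independent = [([1, 0], [1, 0]), ([0, 1], [0, 1])]

ranks = (expression_rank(one_term, 2, 2),
         expression_rank(split, 2, 2),
         expression_rank(independent, 2, 2))
```

The first two expressions have different numbers of terms but the same rank, while the third has rank equal to its number of terms.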


LEMMA 1.1. The rank $r$ of an expression $\sum_{i=1}^{n} f_i \otimes \varphi_i$ is an invariant under
equivalence (Definition 1.1).


Proof. It is easily seen that Rule 1 does not affect $r$. In 2a, let $f_1 = f_1' + f_1''$,
$h = (1/2)(f_1' - f_1'')$. Then $f_1' = (1/2)f_1 + h$, $f_1'' = (1/2)f_1 - h$, and

$f_1 \otimes \varphi_1 \dotplus \sum_{i=2}^{n} f_i \otimes \varphi_i \simeq ((1/2)f_1 + h) \otimes \varphi_1 \dotplus ((1/2)f_1 - h) \otimes \varphi_1 \dotplus \sum_{i=2}^{n} f_i \otimes \varphi_i.$

Now, if $h = \sum_{p=1}^{k} a_p g_p$, a calculation shows that the $f_q^*$ in the above are the
same for both sides of the relation. On the other hand, if $h$ is linearly independent
of $g_1, \dots, g_k$, we may put $g_{k+1} = h$. But $g_{k+1}$ will appear in the $f_q^*$
only with zero coefficients and again the $f_q^*$ are the same for both sides. Since $r$
is the number of linearly independent $f_q^*$, $r$ must be also the same on both
sides.

A similar discussion will show that $r$ is also unaffected by 2b.

In 3, if any term $a_i$ is zero, we may take $f_i$ as zero. If $\varphi_i$ is also zero, we
may disregard the term. If $\varphi_i$ is not zero, it may be taken as a $\psi$. Then it is
easily seen that this term does not contribute to the $f_q^*$, and hence does not
affect the rank. Thus an expression $\sum_{i=1}^{n} (a_i f_i) \otimes \varphi_i$ has the same rank as an
expression in which the terms with $a_i = 0$ are disregarded. A similar statement
holds for $\sum_{i=1}^{n} f_i \otimes (a_i \varphi_i)$.

This implies that we need consider only the case in which each $a_i \neq 0$.
But here we may take the same $g_p$ and $\psi_q$ on both sides of the relation. We
then see that each coefficient $a_{p,q}$ has the same value for both sides. Hence the matrix
$(a_{p,q})$ and the rank are the same in both cases.

Thus the rank is unaffected by each of these rules. It follows that it is
unaffected by any sequence of applications of these rules, and hence it is
invariant under equivalence.


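LEMMA 1.2. Every expression $\sum_{i=1}^{n} f_i \otimes \varphi_i$ is equivalent to either $0 \otimes 0$, or to an
expression $\sum_{i=1}^{m} g_i \otimes \psi_i$ in which both the $g_1, \dots, g_m$ and $\psi_1, \dots, \psi_m$ are linearly
independent. Furthermore, $m$ equals the rank of $\sum_{i=1}^{n} f_i \otimes \varphi_i$.


Proof. It is readily seen that if in either of the sets $f_1, \dots, f_n$; $\varphi_1, \dots, \varphi_n$
the elements are linearly dependent, then $\sum_{i=1}^{n} f_i \otimes \varphi_i$ is equivalent to an expression
involving only $n - 1$ terms. For instance, if $f_1 = \sum_{i=2}^{n} a_i f_i$, then

$\sum_{i=1}^{n} f_i \otimes \varphi_i \simeq \sum_{i=2}^{n} (a_i f_i) \otimes \varphi_1 \dotplus \sum_{i=2}^{n} f_i \otimes \varphi_i \simeq \sum_{i=2}^{n} f_i \otimes (a_i \varphi_1) \dotplus \sum_{i=2}^{n} f_i \otimes \varphi_i \simeq \sum_{i=2}^{n} f_i \otimes (a_i \varphi_1 + \varphi_i).$

We may therefore continue to reduce the number of terms until we have
either $\sum_{i=1}^{m} g_i \otimes \psi_i$ in which both the $g_1, \dots, g_m$ and $\psi_1, \dots, \psi_m$ are linearly
independent, or $f \otimes 0$, or $0 \otimes \varphi$. But $f \otimes 0 \simeq f \otimes (0 \cdot 0) \simeq (0f) \otimes 0 \simeq 0 \otimes 0$. Similarly
$0 \otimes \varphi \simeq 0 \otimes 0$.

Now the expression $\sum_{i=1}^{m} g_i \otimes \psi_i$ with both the $g_i$ and $\psi_i$ linearly independent
has rank $m$. For the $g_i$ and $\psi_i$ can be used as in Definition 1.2. The resulting
matrix is $(\delta_{ij})$, $i, j = 1, \dots, m$. Since, however, the rank of an expression
is invariant under equivalence (Lemma 1.1), $m$ must be also the rank of
$\sum_{i=1}^{n} f_i \otimes \varphi_i$.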
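LEMMA 1.3. Suppose that in the expression $\sum_{i=1}^{n} f_i \otimes \varphi_i$ the $\varphi_i$ are linearly
independent. Then the rank of this expression is $r$, the number of the $f_i$'s which
are linearly independent. In particular, if $\sum_{i=1}^{n} f_i \otimes \varphi_i \simeq 0 \otimes 0$, each $f_i$ is zero.

Proof. If $r = 0$, the rank is zero also. Suppose $r \neq 0$. We may assume that
$f_1, \dots, f_r$ are linearly independent, since otherwise a permutation of the
terms (Rule 1) will give this. Then $f_{r+p} = \sum_{k=1}^{r} a_{p,k} f_k$. Hence

$\sum_{i=1}^{n} f_i \otimes \varphi_i \simeq \sum_{k=1}^{r} f_k \otimes \varphi_k \dotplus \sum_{p=1}^{n-r} \Big(\sum_{k=1}^{r} a_{p,k} f_k\Big) \otimes \varphi_{r+p} \simeq \sum_{k=1}^{r} f_k \otimes \psi_k,$

where

$\psi_k = \varphi_k + \sum_{p=1}^{n-r} a_{p,k}\, \varphi_{r+p}.$

Since the $\varphi_i$'s are linearly independent, that is also true for the $\psi_k$'s. But the
rank of $\sum_{k=1}^{r} f_k \otimes \psi_k$ is $r$ (Lemma 1.2), therefore the rank of $\sum_{i=1}^{n} f_i \otimes \varphi_i$ is also $r$
(Lemma 1.1).

Furthermore, from Lemma 1.2 it follows that an expression is equivalent
to $0 \otimes 0$ if and only if its rank is 0. From the preceding we see that the rank
is zero if and only if each $f_i$ is zero. This implies the last statement of our
lemma.

COROLLARY. If $\sum_{i=1}^{n} f_i \otimes \varphi_i \simeq \sum_{i=1}^{n} g_i \otimes \varphi_i$ and the $\varphi_i$ are linearly independent,
then $f_i = g_i$ for $i = 1, \dots, n$.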

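LEMMA 1.4. If $\sum_{i=1}^{n} f_i \otimes \varphi_i \simeq 0 \otimes 0$, and $k$ of the $f_i$ and $l$ of the $\varphi_i$ are linearly
independent, then $k + l \le n$.

Proof. Suppose that the $\varphi_1, \dots, \varphi_l$ are linearly independent and
$\varphi_{l+j} = \sum_{i=1}^{l} a_{j,i}\varphi_i$ for $j = 1, \dots, n - l$. Then

$\sum_{i=1}^{n} f_i \otimes \varphi_i \simeq \sum_{i=1}^{l} \Big(f_i + \sum_{j=1}^{n-l} a_{j,i} f_{l+j}\Big) \otimes \varphi_i.$

Thus Lemma 1.3 implies $f_i + \sum_{j=1}^{n-l} a_{j,i} f_{l+j} = 0$ for $i = 1, \dots, l$. The $l$ relations
between the $f_i$ are linearly independent, and hence there can be at most $n - l$
of the $f_i$ linearly independent. Thus $k \le n - l$.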
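
The inequality $k + l \le n$ can be observed on a small coordinate example. This is a sketch under the assumption, valid in finite dimensions, that an expression is equivalent to $0 \otimes 0$ exactly when the sum of the outer products $f_i \varphi_i^T$ is the zero matrix; the data and helper names are illustrative.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Three terms with f_3 = f_1 + f_2 and phi_1 = phi_2 = -phi_3.
fs = [[1, 0], [0, 1], [1, 1]]
phis = [[-1, -2], [-1, -2], [1, 2]]

# Coordinate matrix of the expression: sum of the outer products.
total = [[sum(f[p] * phi[q] for f, phi in zip(fs, phis)) for q in range(2)]
         for p in range(2)]
is_zero = all(entry == 0 for row in total for entry in row)

n = len(fs)
k = matrix_rank(fs)    # number of linearly independent f_i
l = matrix_rank(phis)  # number of linearly independent phi_i
```

Here the bound is attained: the two independent $f_i$ leave room for only one independent $\varphi_i$.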

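LEMMA 1.5. If $\sum_{i=1}^{n} f_i \otimes \varphi_i \simeq \sum_{j=1}^{m} g_j \otimes \psi_j$ and each of the sets $f_1, \dots, f_n$;
$\varphi_1, \dots, \varphi_n$; $g_1, \dots, g_m$; $\psi_1, \dots, \psi_m$ is linearly independent, then $n = m$,
and there exists a square matrix $(a_{i,j})$, $i, j = 1, \dots, n$, with an inverse $(A_{i,j})$,
such that

$\psi_i = \sum_{j=1}^{n} a_{i,j}\,\varphi_j, \qquad g_i = \sum_{j=1}^{n} A_{j,i}\, f_j, \qquad i = 1, \dots, n.$

Proof. Lemma 1.3 states that $n$ and $m$ are the ranks of the corresponding
expressions. Thus Lemma 1.1 implies that they are equal. Hence

(1) $\sum_{i=1}^{n} f_i \otimes \varphi_i \dotplus \sum_{i=1}^{n} (-g_i) \otimes \psi_i \simeq 0 \otimes 0.$

Inasmuch as $f_1, \dots, f_n$ are linearly independent, Lemma 1.4 implies that at
most $n$ of the set $\varphi_1, \dots, \varphi_n, \psi_1, \dots, \psi_n$ are linearly independent. Since
$\varphi_1, \dots, \varphi_n$ are linearly independent, each $\psi_i$ must depend upon these. Thus
$\psi_i = \sum_{j=1}^{n} a_{i,j}\varphi_j$. But since $\psi_1, \dots, \psi_n$ are linearly independent, the matrix
$(a_{i,j})$, $i, j = 1, \dots, n$, must have an inverse $(A_{i,j})$, $i, j = 1, \dots, n$, and
$\varphi_i = \sum_{j=1}^{n} A_{i,j}\psi_j$.

Substituting in (1), we get

$\sum_{k=1}^{n} \Big(f_k - \sum_{i=1}^{n} a_{i,k}\, g_i\Big) \otimes \varphi_k \simeq 0 \otimes 0.$

Lemma 1.3 now implies that $f_k = \sum_{i=1}^{n} a_{i,k}\, g_i$ for $k = 1, \dots, n$.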

DEFINITION 1.3. The set of all expressions of the form $\sum_{i=1}^{n} f_i \otimes \varphi_i$ we denote
by $\mathfrak{A}(E_1, E_2)$. Let $\bar{f}$ denote the set of all expressions equivalent to a fixed expression
$\sum_{i=1}^{n} f_i \otimes \varphi_i$. If an expression is in $\bar{f}$ it will be termed "an expression for $\bar{f}$." Let
$\mathfrak{A}^*(E_1, E_2)$ denote the set of such $\bar{f}$'s.


If $\sum_{i=1}^{n} f_i \otimes \varphi_i$ is an expression for $\bar{f}$, $\sum_{j=1}^{m} g_j \otimes \psi_j$ an expression for $\bar{g}$, and $a$ a
number, we define $a\bar{f}$ as the set of expressions equivalent to $\sum_{i=1}^{n} (af_i) \otimes \varphi_i$, and
$\bar{f} + \bar{g}$ as the set of expressions equivalent to $\sum_{i=1}^{n} f_i \otimes \varphi_i \dotplus \sum_{j=1}^{m} g_j \otimes \psi_j$.

It is a consequence of Definition 1.1 and Rules 1, 2, 3 that $a\bar{f}$ does not
depend on the particular expression used. Similarly $\bar{f} + \bar{g}$ is defined uniquely.

It is easy to see that the usual properties of addition and multiplication
by a scalar hold; for instance $\bar{f} + \bar{g} = \bar{g} + \bar{f}$,

$a(\bar{f} + \bar{g}) = a\bar{f} + a\bar{g}$

and

$(a\beta)\bar{f} = a(\beta\bar{f}).$

The zero element 0 is the class of all expressions equivalent to $0 \otimes 0$. Thus
$\mathfrak{A}^*(E_1, E_2)$ is a linear set, that is, a commutative group with scalar operators.

Sometimes we will find it convenient to permit "an expression for $\bar{f}$" to
stand for $\bar{f}$.


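2. LEMMA 2.1.a. If $F$ is an additive and homogenous functional on $E_1$, and
$\sum_{i=1}^{n} f_i \otimes \varphi_i \simeq \sum_{i=1}^{m} g_i \otimes \psi_i$, then

$\sum_{i=1}^{n} F(f_i)\varphi_i = \sum_{i=1}^{m} F(g_i)\psi_i.$

Proof. Consider first $\sum_{i=1}^{n} f_i \otimes \varphi_i$. Suppose that $\varphi_n$ is linearly dependent on
$\varphi_1, \dots, \varphi_{n-1}$, $\varphi_n = \sum_{i=1}^{n-1} a_i \varphi_i$. Then $\sum_{i=1}^{n} f_i \otimes \varphi_i \simeq \sum_{i=1}^{n-1} (f_i + a_i f_n) \otimes \varphi_i$ and

$\sum_{i=1}^{n} F(f_i)\varphi_i = \sum_{i=1}^{n-1} F(f_i)\varphi_i + F(f_n)\Big(\sum_{i=1}^{n-1} a_i \varphi_i\Big) = \sum_{i=1}^{n-1} \big(F(f_i) + a_i F(f_n)\big)\varphi_i = \sum_{i=1}^{n-1} F(f_i + a_i f_n)\varphi_i.$

A similar statement holds in the case in which $f_n$ is linearly dependent on
$f_1, \dots, f_{n-1}$. These results can be applied successively in such a way that one
arrives at an expression $\sum_i f_i' \otimes \varphi_i'$ with the $f_i'$'s and $\varphi_i'$'s linearly independent.
Suppose one has gone through the
corresponding process with $\sum_{i=1}^{m} g_i \otimes \psi_i$. The conclusion of our lemma is then a
simple consequence of Lemma 1.5.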

Let E! resp. Ei denote the set of additive and homogenous functionals on 
E, resp. Es. For Fi,--+, in Et, and ¢1, -- in El, we construct ex- 
pressions > 

A similar reasoning to that in Lemma 2.1.a proves 


LemMA 2.1.b. Jf OV; then for fCE:, we have 
Combining these results we easily obtain 


2.1.c. If 


DXF; 8 DF} @ 4} 


1 
and 
then 


ROBERT SCHATTEN 


(Sr evi Es ® ot) 


where under we understand 


LD LF 
jul i=l 
DEFINITION 2.1. A set $S$ of additive and homogeneous functionals on $E_1$ will
be called fundamental if $F(f) = 0$ for all $F \in S$ implies $f = 0$.


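LEMMA 2.2. Let $S^{(1)}$ denote a fundamental set of additive and homogeneous functionals on $E_1$, and $S^{(2)}$ a
fundamental set of additive and homogeneous functionals on $E_2$. Then if for an expression $\sum_{i=1}^{n} f_i \otimes \varphi_i$ we have
$\sum_{i=1}^{n} F(f_i)\Phi(\varphi_i) = 0$ for all $F \in S^{(1)}$, $\Phi \in S^{(2)}$, then $\sum_{i=1}^{n} f_i \otimes \varphi_i \simeq 0 \otimes 0$.

Proof. Suppose $\sum_{i=1}^{n} f_i \otimes \varphi_i \simeq \sum_{i=1}^{k} f'_i \otimes \varphi'_i$, $k \geq 1$, where the $f'_i$'s and $\varphi'_i$'s are
linearly independent, then for a certain $F^0 \in S^{(1)}$ we have $F^0(f'_1) \neq 0$, and since
the $\varphi'_i$'s are linearly independent $\sum_{i=1}^{k} F^0(f'_i)\varphi'_i \neq 0$. Therefore for a certain
$\Phi^0 \in S^{(2)}$ we have $\sum_{i=1}^{k} F^0(f'_i)\Phi^0(\varphi'_i) \neq 0$. Lemma 2.1 gives $\sum_{i=1}^{n} F^0(f_i)\Phi^0(\varphi_i) \neq 0$.
This completes the proof.

THEOREM 2.1. Let $S^{(1)}$ resp. $S^{(2)}$ denote fundamental sets of additive and
homogeneous functionals on $E_1$ resp. $E_2$. Then a necessary and sufficient condition
for the expressions

\[ \sum_{i=1}^{n} f_i \otimes \varphi_i \quad\text{and}\quad \sum_{i=1}^{n'} f'_i \otimes \varphi'_i \]

to be equivalent is that

\[ \sum_{i=1}^{n} F(f_i)\Phi(\varphi_i) = \sum_{i=1}^{n'} F(f'_i)\Phi(\varphi'_i) \quad\text{for } F \in S^{(1)},\ \Phi \in S^{(2)}. \]

Proof. The necessity follows from Lemma 2.1. For the sufficiency put
$-f'_i = f_{n+i}$, $\varphi'_i = \varphi_{n+i}$; $1 \leq i \leq n'$, then according to our assumption,

\[ \sum_{i=1}^{n+n'} F(f_i)\Phi(\varphi_i) = 0 \quad\text{for } F \in S^{(1)},\ \Phi \in S^{(2)}. \]

Lemma 2.2 gives

\[ \sum_{i=1}^{n+n'} f_i \otimes \varphi_i \simeq 0 \otimes 0. \]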
3. For our further considerations we shall assume that $E_1$ and $E_2$ are two
Banach spaces, and denote by $\bar E_1$ resp. $\bar E_2$ the space of linear functionals on
$E_1$ resp. $E_2$.


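DEFINITION 3.1. Under a norm $N$ in $\mathfrak{A}(E_1, E_2)$ (Definition 1.3) we shall
understand a non-negative function of expressions satisfying the following conditions:
I. $N(\sum_{i=1}^{n} f_i \otimes \varphi_i) = 0$ if and only if $\sum_{i=1}^{n} f_i \otimes \varphi_i \simeq 0 \otimes 0$.
II. $N(\sum_{i=1}^{n} (a f_i) \otimes \varphi_i) = |a|\, N(\sum_{i=1}^{n} f_i \otimes \varphi_i)$ for any real number $a$.
III. $N(\sum_{i=1}^{n} f_i \otimes \varphi_i + \sum_{i=1}^{n'} f'_i \otimes \varphi'_i) \leq N(\sum_{i=1}^{n} f_i \otimes \varphi_i) + N(\sum_{i=1}^{n'} f'_i \otimes \varphi'_i)$.
IV. $N(\sum_{i=1}^{n} f_i \otimes \varphi_i) = N(\sum_{i=1}^{n'} f'_i \otimes \varphi'_i)$ if $\sum_{i=1}^{n} f_i \otimes \varphi_i \simeq \sum_{i=1}^{n'} f'_i \otimes \varphi'_i$.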


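DEFINITION 3.2. Consider the set $\mathfrak{A}(\bar E_1, \bar E_2)$ of expressions of the form
$\sum_{j=1}^{m} F_j \otimes \Phi_j$ where $F_1, \dots, F_m$ are in $\bar E_1$, and $\Phi_1, \dots, \Phi_m$ are in $\bar E_2$. For a
fixed expression $\sum_{j=1}^{m} F_j \otimes \Phi_j$ in $\mathfrak{A}(\bar E_1, \bar E_2)$ we define $\bar N(\sum_{j=1}^{m} F_j \otimes \Phi_j)$ as the
smallest number satisfying the inequality

\[ \Big| \sum_{j=1}^{m} \sum_{i=1}^{n} F_j(f_i)\,\Phi_j(\varphi_i) \Big| \leq \bar N\Big(\sum_{j=1}^{m} F_j \otimes \Phi_j\Big)\, N\Big(\sum_{i=1}^{n} f_i \otimes \varphi_i\Big) \]

for all $\sum_{i=1}^{n} f_i \otimes \varphi_i$ in $\mathfrak{A}(E_1, E_2)$.
Thus $\bar N$ is a function of expressions in $\mathfrak{A}(\bar E_1, \bar E_2)$.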


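LEMMA 3.1. If $\bar N$ is finite for every expression in $\mathfrak{A}(\bar E_1, \bar E_2)$, then $\bar N$ also
satisfies conditions I-IV.

Proof. I. If $\sum_{j=1}^{m} F_j \otimes \Phi_j \simeq 0 \otimes 0$ then Lemma 2.1.c gives $\bar N(\sum_{j=1}^{m} F_j \otimes \Phi_j) = 0$.
Suppose $\sum_{j=1}^{m} F_j \otimes \Phi_j \simeq \sum_{j=1}^{k} F'_j \otimes \Phi'_j$, where the $F'_j$'s and $\Phi'_j$'s are linearly
independent (Lemma 2.2). If $f$ in $E_1$ is such that $F'_1(f) \neq 0$ then
$\sum_{j=1}^{k} F'_j(f)\Phi'_j \neq 0$ since the $\Phi'_j$'s are linearly independent. Let $\varphi \in E_2$ be such
that $\sum_{j=1}^{k} F'_j(f)\Phi'_j(\varphi) \neq 0$. Lemma 2.1.c gives $(\sum_{j=1}^{m} F_j \otimes \Phi_j)(f \otimes \varphi) \neq 0$. This implies
$\bar N(\sum_{j=1}^{m} F_j \otimes \Phi_j) > 0$.

II and III are immediate.

IV is a consequence of Lemma 2.1.c.

For any Banach space $E$ we may assume $E \subset \bar{\bar E}$.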


LEMMA 3.2. $\bar{\bar N} \leq N$ for all expressions in $\mathfrak{A}(E_1, E_2)$.


Proof. Let fi, fo, ise. ¢2,°* + denote elements in resp. 
F,, Fe, +++ resp. $1, $2, + , denote elements in resp. E2; Fi, Fi, resp. 
$3, denote p in resp. For an element in 
F(f*) = is an element of and || F**|| Therefore an expression 
will correspond to an expression where the f? and 
F}°, as well as the ¢? and ¢}° are connected by the above mentioned relation. 
We have 




where sup (that is, the least upper bound) is taken over all expressions 
F;@¢; in Ex), which are not equivalent to 0@0 (of rank 0). There- 


fore 


¥( 


Sree) 


Let be given. There exists an expression 7, F?@¢° such that 


t=1 


From the definition of WV for a given N, we have 


t=1 i=1 j=l 


The last two inequalities give 


v( rie > e!) 


t=1 t=] 


This completes the proof. 


LEMMA 3.3. If $N^{(1)}$ and $N^{(2)}$ denote two functions in $\mathfrak{A}(E_1, E_2)$ satisfying conditions
I-IV and $N^{(1)} \leq N^{(2)}$, then $\bar N^{(1)} \geq \bar N^{(2)}$.


The proof is a simple consequence of Definition 3.2 and Lemma 3.1. 

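Consider the set $\mathfrak{A}^*(E_1, E_2)$ (Definition 1.3). Put $N(f) = N(\sum_{i=1}^{n} f_i \otimes \varphi_i)$ for
$\sum_{i=1}^{n} f_i \otimes \varphi_i$ in $f$. $N(f)$ is single-valued, as follows from IV for $N$. Conditions
I-III tell us that $N(f)$ is a norm in $\mathfrak{A}^*(E_1, E_2)$.

Similarly, considering sets of equivalent expressions in $\mathfrak{A}(\bar E_1, \bar E_2)$ as new
elements $F$, we obtain a set $\mathfrak{A}^*(\bar E_1, \bar E_2)$ and a norm $\bar N(F)$ in $\mathfrak{A}^*(\bar E_1, \bar E_2)$. $\bar N(F)$
will be called the norm associate with $N$.

We complete $\mathfrak{A}^*(E_1, E_2)$ to $E_1 \otimes E_2$ by adding new elements, namely all
fundamental sequences (satisfying Cauchy's condition) of elements in
$\mathfrak{A}^*(E_1, E_2)$ with the following conventions:

($\alpha$) An element $f$ of $\mathfrak{A}^*(E_1, E_2)$ will be considered identical with the
sequence $f, f, f, \dots$.

($\beta$) Two fundamental sequences

\[ f^{(1)}_1, f^{(1)}_2, f^{(1)}_3, \dots \quad\text{and}\quad f^{(2)}_1, f^{(2)}_2, f^{(2)}_3, \dots \]

are considered identical if and only if

\[ \lim_{n \to \infty} N\big(f^{(1)}_n - f^{(2)}_n\big) = 0. \]

($\gamma$) The norm of a fundamental sequence $f_1, f_2, f_3, \dots$ is by definition
$\lim_{n \to \infty} N(f_n)$.

In a similar way we complete $\mathfrak{A}^*(\bar E_1, \bar E_2)$ to $\bar E_1 \otimes \bar E_2$.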

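From Lemma 2.1.c follows that $(\sum_{j=1}^{m} F_j \otimes \Phi_j)(\sum_{i=1}^{n} f_i \otimes \varphi_i)$ depends only
upon $F$ resp. $f$, for which $\sum_{j=1}^{m} F_j \otimes \Phi_j$ resp. $\sum_{i=1}^{n} f_i \otimes \varphi_i$ are expressions; $F(f)$ is
therefore a uniquely defined number for $f$ in $\mathfrak{A}^*(E_1, E_2)$, $F$ in $\mathfrak{A}^*(\bar E_1, \bar E_2)$.

LEMMA 3.4. If $F_1, F_2, \dots$ resp. $f_1, f_2, \dots$ denotes a fundamental sequence
of elements in $\mathfrak{A}^*(\bar E_1, \bar E_2)$ resp. $\mathfrak{A}^*(E_1, E_2)$, then the sequence $F_1(f_1), F_2(f_2), \dots$
is convergent.

The proof is immediate.

Without fear of misunderstanding, we shall also denote the elements of
$E_1 \otimes E_2$ resp. $\bar E_1 \otimes \bar E_2$ by $f$ resp. $F$.

From Lemma 3.4 follows that $F(f)$ is uniquely defined for $f$ in $E_1 \otimes E_2$,
$F$ in $\bar E_1 \otimes \bar E_2$ and $|F(f)| \leq \bar N(F)\,N(f)$.

Let $F^0$ be an element of $\bar E_1 \otimes \bar E_2$. It is a consequence of Lemma 3.4 that
$F^0(f)$ is a linear functional on $E_1 \otimes E_2$. We shall write therefore

\[ \bar E_1 \otimes \bar E_2 \subset \overline{E_1 \otimes E_2} \ (2). \]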


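We shall assume further that $N$ defined on $\mathfrak{A}(E_1, E_2)$ in addition to I-IV
satisfies also the following condition of continuity:

V. $N(\sum_{i=1}^{n} f_i \otimes \varphi_i)$ is a continuous function of the $f_i$ and $\varphi_i$, that is, if $\epsilon > 0$
is given, we can find a $\delta = \delta(\epsilon; f_1, \dots, f_n; \varphi_1, \dots, \varphi_n) > 0$, such that for
$\|f_i - f'_i\| < \delta$, $\|\varphi_i - \varphi'_i\| < \delta$ for $i = 1, \dots, n$, we have

\[ N\Big(\sum_{i=1}^{n} f_i \otimes \varphi_i - \sum_{i=1}^{n} f'_i \otimes \varphi'_i\Big) < \epsilon. \]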

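LEMMA 3.5. If $E_1$, $E_2$ are separable and $N$ satisfies conditions I-V, then
$E_1 \otimes E_2$ is separable.

Proof. Let $f^0_1, f^0_2, \dots$ resp. $\varphi^0_1, \varphi^0_2, \dots$ denote two sequences dense in
$E_1$ resp. $E_2$. Then the set of expressions $\sum_{i=1}^{n} f^0_{k_i} \otimes \varphi^0_{l_i}$; $k_i, l_i = 1, 2, \dots$;
$n = 1, 2, 3, \dots$, is dense in $\mathfrak{A}(E_1, E_2)$. Hence the set of elements $f$ for which
these expressions stand is dense in $\mathfrak{A}^*(E_1, E_2)$, therefore also in $E_1 \otimes E_2$.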


DEFINITION 3.3. A function $N$ of expressions in $\mathfrak{A}(E_1, E_2)$ is called a crossnorm
if in addition to I-IV it satisfies the following "cross-property":

\[ N(f \otimes \varphi) = \|f\|\,\|\varphi\| \quad\text{for } f \in E_1,\ \varphi \in E_2. \]

(2) A supplementary remark is made in Part A, §6.


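LEMMA 3.6. A crossnorm $N$ satisfies condition V.

Proof. This is an immediate consequence of the following relation

\[ N\Big(\sum_{i=1}^{n} f_i \otimes \varphi_i - \sum_{i=1}^{n} f'_i \otimes \varphi'_i\Big) = N\Big(\sum_{i=1}^{n} (f_i - f'_i) \otimes \varphi_i + \sum_{i=1}^{n} f'_i \otimes (\varphi_i - \varphi'_i)\Big) \leq \sum_{i=1}^{n} \|f_i - f'_i\|\,\|\varphi_i\| + \sum_{i=1}^{n} \|f'_i\|\,\|\varphi_i - \varphi'_i\|. \]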


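Sometimes we shall assume that the norm $N$ satisfies the following condition:

VI. $\|\sum_{i=1}^{n} F(f_i)\varphi_i\| \leq \|F\|\, N(\sum_{i=1}^{n} f_i \otimes \varphi_i)$ for $F$ in $\bar E_1$, and $\sum_{i=1}^{n} f_i \otimes \varphi_i$ in $\mathfrak{A}(E_1, E_2)$.

DEFINITION 3.4. From Lemma 2.1 follows that $\sum_{i=1}^{n} F(f_i)\varphi_i$ is invariant
under equivalence. Let $f$ be an element of $\mathfrak{A}^*(E_1, E_2)$ for which $\sum_{i=1}^{n} f_i \otimes \varphi_i$ is an
expression. We define $T_f$ as the transformation from $\bar E_1$ to $E_2$ such that
$T_f F = \sum_{i=1}^{n} F(f_i)\varphi_i$.

LEMMA 3.7. If $T_f = 0$ then $f = 0$.

Proof. Let $f \neq 0$ and $\sum_{i=1}^{n} f_i \otimes \varphi_i$ be an expression for $f$ for which the $f_i$'s
and $\varphi_i$'s are linearly independent. Thus $f_1 \neq 0$. We can find an $F \in \bar E_1$ such
that $F(f_1) \neq 0$. This implies $\sum_{i=1}^{n} F(f_i)\varphi_i \neq 0$ since the $\varphi_i$'s are linearly independent.
Thus $T_f F \neq 0$ and $T_f \neq 0$. We have shown that $f \neq 0$ implies $T_f \neq 0$.
This completes the proof.

LEMMA 3.8. Conditions II, IV, and VI for $N$ imply I.

Proof. From II and IV follows that $\sum_{i=1}^{n} f_i \otimes \varphi_i \simeq 0 \otimes 0$ implies
$N(\sum_{i=1}^{n} f_i \otimes \varphi_i) = 0$. Now let $N(\sum_{i=1}^{n} f_i \otimes \varphi_i) = 0$. VI implies $\|T_f F\| = 0$ for
$F$ in $\bar E_1$. Hence $f = 0$ by Lemma 3.7.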

4. Among the functions $N$, a particular one which we shall denote by $N_1$
will be of great interest to us.

DEFINITION 4.1. Let $\sum_{i=1}^{k} f^0_i \otimes \varphi^0_i$ be a fixed expression. We define
$N_1(\sum_{i=1}^{k} f^0_i \otimes \varphi^0_i) = \inf \sum_{i=1}^{n} \|f_i\|\,\|\varphi_i\|$, where inf (the greatest lower bound) is taken
over the set of all expressions $\sum_{i=1}^{n} f_i \otimes \varphi_i$ equivalent to $\sum_{i=1}^{k} f^0_i \otimes \varphi^0_i$.

LEMMA 4.1. $N_1$ satisfies conditions II, III, IV, and VI, therefore also I by
Lemma 3.8; $N_1$ is a crossnorm, therefore it satisfies V as follows from Lemma 3.6.


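Proof. That $N_1$ satisfies II and IV is obvious. We proceed to prove III.
Let $\sum_{i} f^0_i \otimes \varphi^0_i$ and $\sum_{i} g^0_i \otimes \psi^0_i$ be two given expressions, and let $\epsilon > 0$ be given.
Take an expression

\[ \sum_{i=1}^{n} f_i \otimes \varphi_i \simeq \sum_{i} f^0_i \otimes \varphi^0_i \]

such that

\[ \sum_{i=1}^{n} \|f_i\|\,\|\varphi_i\| \leq N_1\Big(\sum_{i} f^0_i \otimes \varphi^0_i\Big) + \frac{\epsilon}{2}. \]

Similarly we find

\[ \sum_{i=1}^{m} g_i \otimes \psi_i \simeq \sum_{i} g^0_i \otimes \psi^0_i \]

such that

\[ \sum_{i=1}^{m} \|g_i\|\,\|\psi_i\| \leq N_1\Big(\sum_{i} g^0_i \otimes \psi^0_i\Big) + \frac{\epsilon}{2}. \]

We have

\[ \sum_{i=1}^{n} f_i \otimes \varphi_i + \sum_{i=1}^{m} g_i \otimes \psi_i \simeq \sum_{i} f^0_i \otimes \varphi^0_i + \sum_{i} g^0_i \otimes \psi^0_i. \]

Condition IV and Definition 4.1 give

\[ N_1\Big(\sum_{i} f^0_i \otimes \varphi^0_i + \sum_{i} g^0_i \otimes \psi^0_i\Big) \leq \sum_{i=1}^{n} \|f_i\|\,\|\varphi_i\| + \sum_{i=1}^{m} \|g_i\|\,\|\psi_i\| \leq N_1\Big(\sum_{i} f^0_i \otimes \varphi^0_i\Big) + N_1\Big(\sum_{i} g^0_i \otimes \psi^0_i\Big) + \epsilon. \]

This proves III.

We proceed to prove VI. Let $\sum_{i=1}^{n} f_i \otimes \varphi_i$ be an expression for $f$. We have
(Definition 3.4)

\[ \|T_f F\| = \Big\|\sum_{i=1}^{n} F(f_i)\varphi_i\Big\| \leq \|F\| \sum_{i=1}^{n} \|f_i\|\,\|\varphi_i\|. \]

This holds for every expression in $f$. Hence

\[ \|T_f F\| \leq \|F\| \inf \sum_{i=1}^{n} \|f_i\|\,\|\varphi_i\| = \|F\|\, N_1(f). \]

To prove the cross-property, we assume $f \neq 0$ and $\varphi \neq 0$ (for $f = 0$ or
$\varphi = 0$, the proof is obvious). Let $F \in \bar E_1$ be such that $F(f) = \|f\|$, $\|F\| = 1$. If
$\sum_{i=1}^{n} f_i \otimes \varphi_i \simeq f \otimes \varphi$ we have (Lemma 2.1)

\[ \|F(f)\varphi\| = \Big\|\sum_{i=1}^{n} F(f_i)\varphi_i\Big\| \quad\text{or}\quad \|f\|\,\|\varphi\| \leq \sum_{i=1}^{n} \|f_i\|\,\|\varphi_i\|. \]

Therefore $N_1(f \otimes \varphi) = \|f\|\,\|\varphi\|$. This completes the proof.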


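LEMMA 4.2. $N_1$ is the greatest crossnorm.

Proof. For any crossnorm $N$, we have

\[ N\Big(\sum_{i=1}^{n} f_i \otimes \varphi_i\Big) \leq \sum_{i=1}^{n} N(f_i \otimes \varphi_i) = \sum_{i=1}^{n} \|f_i\|\,\|\varphi_i\|. \]

Condition IV for $N$ implies

\[ N\Big(\sum_{i=1}^{k} f^0_i \otimes \varphi^0_i\Big) \leq \inf \sum_{i=1}^{n} \|f_i\|\,\|\varphi_i\| = N_1\Big(\sum_{i=1}^{k} f^0_i \otimes \varphi^0_i\Big), \]

where inf is taken over the set of all expressions $\sum_{i=1}^{n} f_i \otimes \varphi_i$ equivalent to
$\sum_{i=1}^{k} f^0_i \otimes \varphi^0_i$.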


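LEMMA 4.3. The norm $\bar N$ associated with a crossnorm $N$ satisfies the condition
$\bar N(F \otimes \Phi) \geq \|F\|\,\|\Phi\|$ for $F \in \bar E_1$, $\Phi \in \bar E_2$.

The proof is immediate.

DEFINITION 4.2. We define $N_0$:

\[ N_0\Big(\sum_{i=1}^{n} f_i \otimes \varphi_i\Big) = \sup \frac{\big|\sum_{i=1}^{n} F(f_i)\,\Phi(\varphi_i)\big|}{\|F\|\,\|\Phi\|}, \]

where sup (the least upper bound) is taken over the set of numbers obtained when
$F$ resp. $\Phi$ varies in $\bar E_1$ resp. $\bar E_2$.

LEMMA 4.4. $N_0$ and $\bar N_0$ are crossnorms.

Proof. It is not difficult to verify that $N_0$ is a crossnorm. We shall prove
that $\bar N_0$ is a crossnorm. We have

\[ \Big|\sum_{i=1}^{n} F^0(f_i)\,\Phi^0(\varphi_i)\Big| \leq \|F^0\|\,\|\Phi^0\|\, N_0\Big(\sum_{i=1}^{n} f_i \otimes \varphi_i\Big). \]

Hence (Definition 3.2) $\bar N_0(F^0 \otimes \Phi^0) \leq \|F^0\|\,\|\Phi^0\|$. This together with Lemma
4.3 concludes the proof.

LEMMA 4.5. The associate $\bar N$ with a crossnorm $N \geq N_0$ is also a crossnorm.

Proof. Lemma 3.3 gives $\bar N \leq \bar N_0$. In particular $\bar N(F \otimes \Phi) \leq \bar N_0(F \otimes \Phi)
= \|F\|\,\|\Phi\|$ by Lemma 4.4. An application of Lemma 4.3 concludes the proof.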


THEOREM 4.1. The associate $\bar N$ with a crossnorm $N$ is also a crossnorm if
and only if $N \geq N_0$; $N_0$ is therefore the least crossnorm whose associate is also a
crossnorm.

Proof. The sufficiency is proved in Lemma 4.5. We shall prove the necessity.
Suppose that for a crossnorm $N$ and a certain expression $\sum_{i=1}^{n} f^0_i \otimes \varphi^0_i$ we
have

\[ N\Big(\sum_{i=1}^{n} f^0_i \otimes \varphi^0_i\Big) < N_0\Big(\sum_{i=1}^{n} f^0_i \otimes \varphi^0_i\Big). \]

Then, there exists an $F^0 \in \bar E_1$ and $\Phi^0 \in \bar E_2$ such that

\[ N\Big(\sum_{i=1}^{n} f^0_i \otimes \varphi^0_i\Big) < \frac{\big|\sum_{i=1}^{n} F^0(f^0_i)\,\Phi^0(\varphi^0_i)\big|}{\|F^0\|\,\|\Phi^0\|} \]

and consequently

\[ \bar N(F^0 \otimes \Phi^0) > \|F^0\|\,\|\Phi^0\|, \]

or $\bar N$ is not a crossnorm. This completes the proof(3).

THEOREM 4.1.1. A crossnorm $N$ satisfies condition VI if and only if its
associate $\bar N$ is also a crossnorm.

The proof is similar to that in Theorem 4.1.

In our future work, speaking about "the least crossnorm," we shall have
in mind "the least crossnorm whose associate is also a crossnorm," namely $N_0$.

5. The least crossnorm $N_0$, as well as the greatest crossnorm $N_1$, are
defined for any two Banach spaces $E_1$, $E_2$. For this reason we shall call them
general crossnorms. Similarly $(1/2)(N_0 + N_1)$, $(1/2)(N_0 + \bar N_1)$, are general
crossnorms(4).

Let K denote the smallest class of crossnorms satisfying the following 
conditions: 

1. $N_1 \in K$.

2. If $N \in K$, then $\bar N \in K$.

3. If $N^{(1)}$ and $N^{(2)}$ belong to $K$, then $aN^{(1)} + (1 - a)N^{(2)}$ belongs to $K$ for
$0 < a < 1$.

4. If $N^{(1)}, N^{(2)}, N^{(3)}, \dots$ denotes a monotonic sequence of crossnorms belonging
to $K$, then its limit (which exists because the sequence is bounded by
the least and greatest crossnorms) also is in $K$.

DEFINITION 5.1. Crossnorms in $K$ are defined for any two Banach spaces
$E_1$, $E_2$. For this reason we shall call them general crossnorms.


(3) An immediate problem is mentioned in Part B, §6.

(4) For a general crossnorm $N$, $\bar N$ is to have the following significance in $\mathfrak{A}^*(E_1, E_2)$. We
consider $N$ on $\mathfrak{A}^*(\bar E_1, \bar E_2)$. For this there is an $\bar N$ defined on $\mathfrak{A}^*(\bar{\bar E}_1, \bar{\bar E}_2)$. We consider the latter
confined to $\mathfrak{A}^*(E_1, E_2)$.

LEMMA 5.1. If $N^{(1)}$ and $N^{(2)}$ denote two crossnorms (not necessarily general)
and $a$, $b$ real numbers, such that $a + b = 1$, $0 < a < 1$, then

\[ \overline{aN^{(1)} + bN^{(2)}} \leq a\bar N^{(1)} + b\bar N^{(2)}. \]

Proof. We shall prove that for any $F_0 \in \mathfrak{A}^*(\bar E_1, \bar E_2)$ we have

\[ \sup \frac{|F_0(f)|}{aN^{(1)}(f) + bN^{(2)}(f)} \leq a \sup \frac{|F_0(f)|}{N^{(1)}(f)} + b \sup \frac{|F_0(f)|}{N^{(2)}(f)}, \]

where sup is taken over all $f$'s ($\neq 0$) in $\mathfrak{A}^*(E_1, E_2)$. Suppose that the contrary
holds, that is, for a certain $F_0$ the last inequality does not hold. Then for a
certain $f_0$ in $\mathfrak{A}^*(E_1, E_2)$ we have

\[ \frac{|F_0(f_0)|}{aN^{(1)}(f_0) + bN^{(2)}(f_0)} > a\,\frac{|F_0(f_0)|}{N^{(1)}(f_0)} + b\,\frac{|F_0(f_0)|}{N^{(2)}(f_0)}. \]

This gives: $\{N^{(1)}(f_0) - N^{(2)}(f_0)\}^2 < 0$. This cannot happen, and therefore the
proof is completed.
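The last step of the proof rests on an elementary inequality for positive numbers $x = N^{(1)}(f_0)$, $y = N^{(2)}(f_0)$. The following is a minimal numerical sketch of that step only, not part of the original argument; the function names `gap` and `closed_form` and the sampling ranges are illustrative assumptions. It checks that $a/x + b/y - 1/(ax + by) = ab(x - y)^2 / (xy(ax + by)) \geq 0$ when $a + b = 1$, which is why the contrary assumption would force $(x - y)^2 < 0$.

```python
import random


def gap(a, x, y):
    """Return a/x + b/y - 1/(a*x + b*y) with b = 1 - a (hypothetical helper)."""
    b = 1.0 - a
    return a / x + b / y - 1.0 / (a * x + b * y)


def closed_form(a, x, y):
    """Same quantity written as a*b*(x - y)^2 / (x*y*(a*x + b*y))."""
    b = 1.0 - a
    return a * b * (x - y) ** 2 / (x * y * (a * x + b * y))


# Sample weights 0 < a < 1 and positive "norm values" x, y (ranges are arbitrary).
rng = random.Random(0)
samples = [
    (rng.uniform(0.01, 0.99), rng.uniform(0.1, 10.0), rng.uniform(0.1, 10.0))
    for _ in range(1000)
]
worst_gap = min(gap(a, x, y) for a, x, y in samples)
largest_mismatch = max(abs(gap(a, x, y) - closed_form(a, x, y)) for a, x, y in samples)
```

Here `worst_gap` stays non-negative up to rounding, and `largest_mismatch` measures how closely the two forms agree in floating point.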


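COROLLARY. Let $N^{(1)}$ and $N^{(2)}$ denote two general crossnorms, and $a$, $b$ real
numbers, such that $a + b = 1$, $0 < a < 1$, then

\[ \overline{aN^{(1)} + bN^{(2)}} \leq a\bar N^{(1)} + b\bar N^{(2)}. \]

LEMMA 5.2. If $N$ denotes a general crossnorm and $\bar N$ its associate, then putting
$N^{(a)} = aN + (1 - a)\bar N$, $0 \leq a \leq 1$, we have $N^{(a)} + \bar N^{(a)} \leq N + \bar N$ (5).

Proof. For $a = 1$ the statement is trivial. For $a = 0$, $\bar N + \bar{\bar N} \leq \bar N + N$ as
follows from Lemma 3.2. For $0 < a < 1$, Lemma 5.1 gives $N^{(a)} + \bar N^{(a)}
\leq \{aN + (1 - a)\bar N\} + \{a\bar N + (1 - a)\bar{\bar N}\} \leq N + \bar N$ as follows from Lemma 3.2.

COROLLARY. For a general crossnorm $N$, we have $\overline{(1/2)(N + \bar N)}
\leq (1/2)(N + \bar N)$.

Proof. $N^{(1/2)} + \bar N^{(1/2)} \leq N + \bar N = 2N^{(1/2)}$ (Lemma 5.2). Hence $\bar N^{(1/2)} \leq N^{(1/2)}$.
This completes the proof.

We shall show an immediate application of the last result.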
We shall show an immediate application of the last result. 


THEOREM 5.1. There exists a general crossnorm $N$ with the following properties:

$\alpha$. $N$ is identical with its associate $\bar N$.

$\beta$. $N = \lim_{n \to \infty} N^{(n)}$, where $\{N^{(n)}\}$ is a monotonic sequence of general crossnorms
defined in the following way: Let $N_1$ denote the greatest crossnorm. Put
$N^{(1)} = (1/2)(N_1 + \bar N_1)$. Let $n > 1$, and suppose we have defined $N^{(k)}$ for $k < n$,
then put $N^{(n)} = (1/2)(N^{(n-1)} + \bar N^{(n-1)})$.

Proof. Since $N_1$ is the greatest crossnorm, its associate $\bar N_1$ is also a
crossnorm (Lemma 4.5), and therefore $\bar N_1 \leq N_1$. Lemma 3.3 and the corollary
to Lemma 5.2 give


(5) For simplicity of notation $\bar N^{(a)}$ shall denote the associate with $N^{(a)}$.

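\[ \bar N_1 \leq \bar N^{(1)} \leq N^{(1)} \leq N_1. \]

The last inequality proves that $\bar N^{(1)}$ is a crossnorm, therefore $N^{(2)}$ is also a
crossnorm. Similarly we obtain

\[ \bar N_1 \leq \bar N^{(1)} \leq \bar N^{(2)} \leq N^{(2)} \leq N^{(1)} \leq N_1. \]

This proves that $\bar N^{(2)}$ is also a crossnorm, therefore $N^{(3)}$ is a crossnorm. Repeating
the same process, we obtain a decreasing sequence of crossnorms

\[ N_1 \geq N^{(1)} \geq N^{(2)} \geq \cdots \]

and an increasing sequence of its associate crossnorms

\[ \bar N_1 \leq \bar N^{(1)} \leq \bar N^{(2)} \leq \cdots. \]

The first sequence is bounded from below by $\bar N_1$, hence is convergent;
let $N$ denote its limit. The second sequence is bounded from above by $N_1$,
hence convergent; let $\mathfrak{N}$ denote its limit. Therefore:

(1) $\lim_{n \to \infty} N^{(n)} = N$; $\lim_{n \to \infty} \bar N^{(n)} = \mathfrak{N}$.

It is easy to verify that

\[ N^{(n)} - \bar N^{(n)} \leq (1/2^n)(N_1 - \bar N_1) \quad\text{for } n = 1, 2, \dots \]

because

\[ N^{(1)} - \bar N^{(1)} = (1/2)(N_1 + \bar N_1) - \bar N^{(1)} \leq (1/2)(N_1 + \bar N_1) - \bar N_1 = (1/2)(N_1 - \bar N_1), \]

\[ N^{(2)} - \bar N^{(2)} = (1/2)(N^{(1)} + \bar N^{(1)}) - \bar N^{(2)} \leq (1/2)(N^{(1)} + \bar N^{(1)}) - \bar N^{(1)} = (1/2)(N^{(1)} - \bar N^{(1)}) \leq (1/2^2)(N_1 - \bar N_1), \ \dots \]

Therefore: $N = \mathfrak{N}$. We shall prove $\bar N = N$. Since $N \leq N^{(n)}$ for $n = 1, 2, \dots$,
$\bar N \geq \bar N^{(n)}$; hence (1) gives $\bar N \geq \mathfrak{N}$ or

(2) $\bar N \geq N$.

On the other hand,

\[ \mathfrak{N} \geq \bar N^{(n)} \quad\text{for } n = 1, 2, \dots; \]

hence $N \geq \bar N^{(n)}$ and therefore (Lemma 3.2) $N^{(n)} \geq \bar N$. (1) gives

(3) $N \geq \bar N$.

(2) and (3) give the required result(6).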
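The averaging construction can be imitated in a much simpler setting. The sketch below is only a finite-dimensional analogue under stated assumptions, not the construction on $E_1 \otimes E_2$ itself: it takes an ordinary norm on the plane, sampled on `K` unit directions, uses the Euclidean inner product for the pairing, and replaces the associate by a discretised dual norm. The names `dual` and `iterate_average` are hypothetical. Starting from the $l_1$ norm, the iterates $N^{(n)} = (1/2)(N^{(n-1)} + \bar N^{(n-1)})$ decrease towards the one norm that equals its own dual, the Euclidean norm, whose value on unit directions is 1.

```python
import math


def dual(values, angles):
    """Discretised dual norm on unit directions: max_k cos(t_k - p) / N(t_k)."""
    return [
        max(math.cos(t - p) / v for t, v in zip(angles, values))
        for p in angles
    ]


def iterate_average(values, angles, steps):
    """Repeat N <- (N + dual(N)) / 2, the analogue of N^(n) in Theorem 5.1."""
    for _ in range(steps):
        associate = dual(values, angles)
        values = [(v + w) / 2.0 for v, w in zip(values, associate)]
    return values


K = 180  # number of sampled directions (an arbitrary choice)
angles = [2.0 * math.pi * k / K for k in range(K)]
l1_norm = [abs(math.cos(t)) + abs(math.sin(t)) for t in angles]

limit = iterate_average(l1_norm, angles, 20)
max_deviation = max(abs(v - 1.0) for v in limit)
```

As in the proof above, the gap between an iterate and its dual is at least halved at every step, so `max_deviation` shrinks geometrically with the number of steps.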


THEOREM 5.1.1. Let $N^0$ denote a general crossnorm (Definition 5.1). The
same construction (which we have applied to $N_1$ in Theorem 5.1) applied to $N^0$,
will always lead to a crossnorm $N$ satisfying conditions $\alpha$, $\beta$ mentioned in
Theorem 5.1.

(6) The immediate problem is mentioned in Part C, §6.


Proof. Let us notice that for a general crossnorm $N^0$, we have always
$\bar N_1 \leq N^0 \leq N_1$, hence also $\bar N_1 \leq \bar N^0 \leq N_1$ by virtue of Lemmas 3.2 and 3.3.
This assures that the associate with a general crossnorm is also a crossnorm.
We put $N^{(1)} = (1/2)(N^0 + \bar N^0)$. $N^{(1)}$ is a crossnorm and $\bar N^{(1)} \leq N^{(1)}$ (Lemma 5.2)
is also a crossnorm. We have therefore the situation mentioned in Theorem
5.1. Let $n > 1$, and suppose we have defined $N^{(k)}$ for $k < n$, then put
$N^{(n)} = (1/2)(N^{(n-1)} + \bar N^{(n-1)})$. As in Theorem 5.1 we prove that $\lim_{n \to \infty} N^{(n)}$
exists; call it $N$, and $\lim_{n \to \infty} \bar N^{(n)}$ exists; call it $\mathfrak{N}$. Further,

\[ N^{(n)} - \bar N^{(n)} \leq (1/2^{n-1})(N^{(1)} - \bar N^{(1)}) \quad\text{for } n = 1, 2, \dots; \]

hence $N = \mathfrak{N}$. Finally we prove (in exactly the same way as in Theorem 5.1)
that $\bar N = N$.

We shall apply our results to Hilbert spaces. Let $E_1$, $E_2$ denote two Hilbert
spaces, and $N$ a crossnorm in $\mathfrak{A}(E_1, E_2)$ (Definition 1.3).
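DEFINITION 5.2. $N$ will be termed self-associate if for every expression
$\sum_{j=1}^{m} F_j \otimes \Phi_j$ in $\mathfrak{A}(\bar E_1, \bar E_2)$, we have

\[ \bar N\Big(\sum_{j=1}^{m} F_j \otimes \Phi_j\Big) = N\Big(\sum_{j=1}^{m} f_j \otimes \varphi_j\Big), \]

where $f_j$ resp. $\varphi_j$ is the element in $E_1$ resp. $E_2$ for which $F_j(f) = (f, f_j)$ in $E_1$ resp.
$\Phi_j(\varphi) = (\varphi, \varphi_j)$ in $E_2$.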

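For two Hilbert spaces $E_1$, $E_2$, F. J. Murray and J. von Neumann introduce
the following crossnorm in $\mathfrak{A}(E_1, E_2)$ [10]:

\[ N\Big(\sum_{i=1}^{n} f_i \otimes \varphi_i\Big) = \Big\{\sum_{i,k=1}^{n} (f_i, f_k)(\varphi_i, \varphi_k)\Big\}^{1/2}, \]

where the symbol $(f_i, f_k)$ denotes the inner product. Let $\bar N$ denote the associate
with $N$, and let

\[ \sum_{j=1}^{m} F^0_j \otimes \Phi^0_j \]

be a fixed expression in $\mathfrak{A}(\bar E_1, \bar E_2)$. We have

(1) \[ \bar N\Big(\sum_{j=1}^{m} F^0_j \otimes \Phi^0_j\Big) = \sup \frac{\big|\sum_{j=1}^{m}\sum_{i=1}^{n} F^0_j(f_i)\,\Phi^0_j(\varphi_i)\big|}{N\big(\sum_{i=1}^{n} f_i \otimes \varphi_i\big)}, \]

where $F^0_j(f) = (f, f^0_j)$, $\Phi^0_j(\varphi) = (\varphi, \varphi^0_j)$. Applying Schwarz's inequality to the
numerator, we get

\[ \bar N\Big(\sum_{j=1}^{m} F^0_j \otimes \Phi^0_j\Big) \leq \sup \Big( \Big\{\sum_{j,l=1}^{m} (f^0_j, f^0_l)(\varphi^0_j, \varphi^0_l)\Big\}^{1/2} \Big\{\sum_{i,k=1}^{n} (f_i, f_k)(\varphi_i, \varphi_k)\Big\}^{1/2} \Big/ \Big\{\sum_{i,k=1}^{n} (f_i, f_k)(\varphi_i, \varphi_k)\Big\}^{1/2} \Big) \]

or

(2) \[ \bar N\Big(\sum_{j=1}^{m} F^0_j \otimes \Phi^0_j\Big) \leq N\Big(\sum_{j=1}^{m} f^0_j \otimes \varphi^0_j\Big). \]

On the other hand, taking in particular $\sum_{j=1}^{m} f^0_j \otimes \varphi^0_j$ for the variable expression
$\sum_{i=1}^{n} f_i \otimes \varphi_i$, we get from (1)

\[ \bar N\Big(\sum_{j=1}^{m} F^0_j \otimes \Phi^0_j\Big) \geq \frac{N\big(\sum_{j=1}^{m} f^0_j \otimes \varphi^0_j\big)^2}{N\big(\sum_{j=1}^{m} f^0_j \otimes \varphi^0_j\big)}. \]

This means $\bar N(\sum_{j=1}^{m} F^0_j \otimes \Phi^0_j) \geq N(\sum_{j=1}^{m} f^0_j \otimes \varphi^0_j)$. The last inequality together
with (2) proves that

\[ \bar N\Big(\sum_{j=1}^{m} F^0_j \otimes \Phi^0_j\Big) = N\Big(\sum_{j=1}^{m} f^0_j \otimes \varphi^0_j\Big), \]

or the crossnorm introduced by F. J. Murray and J. von Neumann is self-associate
in the sense of Definition 5.2.

We shall denote this crossnorm by $S_{M,N}$.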
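A small numerical illustration may help fix ideas. It is a sketch under assumptions that are not made in the text: take $E_1 = E_2$ to be the Euclidean plane, so that an expression $\sum f_i \otimes \varphi_i$ can be pictured as the 2 x 2 matrix $\sum f_i \varphi_i^{T}$, and equivalent expressions give the same matrix. It is a standard fact from later literature, not stated here, that in this setting the least crossnorm $N_0$ is the largest singular value, the greatest crossnorm $N_1$ is the sum of the singular values, and the crossnorm of Murray and von Neumann is the square root of the sum of squared entries. All function names below are hypothetical.

```python
import math


def expression_matrix(pairs):
    """Matrix sum of outer products f_i * phi_i^T for pairs of 2-vectors."""
    m = [[0.0, 0.0], [0.0, 0.0]]
    for f, phi in pairs:
        for r in range(2):
            for c in range(2):
                m[r][c] += f[r] * phi[c]
    return m


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def murray_von_neumann(pairs):
    """{ sum_{i,k} (f_i, f_k)(phi_i, phi_k) }^(1/2), computed from the expression."""
    total = sum(dot(f, g) * dot(p, q) for f, p in pairs for g, q in pairs)
    return math.sqrt(total)


def matrix_norms(m):
    """(largest singular value, Frobenius norm, sum of singular values) of a 2x2 matrix."""
    t = sum(v * v for row in m for v in row)          # s1^2 + s2^2
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]       # |det| = s1 * s2
    root = math.sqrt(max(t * t - 4.0 * det * det, 0.0))
    return math.sqrt((t + root) / 2.0), math.sqrt(t), math.sqrt(t + 2.0 * abs(det))


# Two equivalent expressions for the same element (both sum to the identity matrix).
expr_a = [((1.0, 0.0), (1.0, 0.0)), ((0.0, 1.0), (0.0, 1.0))]
expr_b = [((1.0, 1.0), (0.5, 0.5)), ((1.0, -1.0), (0.5, -0.5))]

least, middle, greatest = matrix_norms(expression_matrix(expr_a))
rank_one = matrix_norms(expression_matrix([((3.0, 4.0), (1.0, 2.0))]))
```

For the identity element the three values are ordered `least <= middle <= greatest`, while for the single product $f \otimes \varphi$ all three coincide with $\|f\|\,\|\varphi\|$, which is the cross-property.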


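THEOREM 5.2. If $E_1$ and $E_2$ denote two Hilbert spaces, then every self-associate
crossnorm in $\mathfrak{A}(E_1, E_2)$ is identical with $S_{M,N}$.

Proof. Let $N$ be a crossnorm in $\mathfrak{A}(E_1, E_2)$. Since $\bar E_1 = E_1$, $\bar E_2 = E_2$, for an
expression $\sum_{j=1}^{m} F^0_j \otimes \Phi^0_j$ in $\mathfrak{A}(\bar E_1, \bar E_2)$ we have

\[ \bar N\Big(\sum_{j=1}^{m} F^0_j \otimes \Phi^0_j\Big) = \sup \frac{\big|\sum_{j=1}^{m}\sum_{i=1}^{n} F^0_j(f_i)\,\Phi^0_j(\varphi_i)\big|}{N\big(\sum_{i=1}^{n} f_i \otimes \varphi_i\big)}, \]

where $F^0_j(f) = (f, f^0_j)$ in $E_1$, $\Phi^0_j(\varphi) = (\varphi, \varphi^0_j)$ in $E_2$. Taking in particular
$\sum_{j=1}^{m} f^0_j \otimes \varphi^0_j$ for the variable expression $\sum_{i=1}^{n} f_i \otimes \varphi_i$ we get

\[ \bar N\Big(\sum_{j=1}^{m} F^0_j \otimes \Phi^0_j\Big) \geq \frac{S_{M,N}\big(\sum_{j=1}^{m} f^0_j \otimes \varphi^0_j\big)^2}{N\big(\sum_{j=1}^{m} f^0_j \otimes \varphi^0_j\big)}. \]

In particular if $N$ is self-associate in the sense of Definition 5.2, that is,
$\bar N(\sum_{j=1}^{m} F^0_j \otimes \Phi^0_j) = N(\sum_{j=1}^{m} f^0_j \otimes \varphi^0_j)$, we get

\[ N^2 \geq S_{M,N}^2 \quad\text{or}\quad N \geq S_{M,N}. \]

Taking the associate crossnorms for both sides of the last inequality, we get
$\bar N \leq \bar S_{M,N}$ or, since $N$ and $S_{M,N}$ are self-associate, $N \leq S_{M,N}$. This gives
$N = S_{M,N}$, and the proof is completed.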


THEOREM 5.3. If $E_1$ and $E_2$ denote two Hilbert spaces, then the general crossnorm
$N$ constructed in Theorem 5.1 by means of the greatest crossnorm $N_1$
coincides with the usual self-associate crossnorm $S_{M,N}$.

Proof. From the construction of $N$, and $\bar{\bar E}_1 = E_1$, $\bar{\bar E}_2 = E_2$ obviously follows
that $N$ is self-associate in the sense of Definition 5.2. Thus, $N$ is identical with
$S_{M,N}$ by Theorem 5.2.

6. In this section we present some remarks about the work of the preceding
sections.

A. As a consequence of Lemma 3.4, $\bar E_1 \otimes \bar E_2 \subset \overline{E_1 \otimes E_2}$ has been proved. It
does not appear to be a simple matter to describe the exact conditions imposed
upon a crossnorm, under which the relation $\bar E_1 \otimes \bar E_2 = \overline{E_1 \otimes E_2}$ holds.

B. We have proved the existence of the least crossnorm, whose associate
is also a crossnorm. We did not settle, however, whether the associate with
every crossnorm is also a crossnorm, or there exist crossnorms whose associates
are not crossnorms(7).

C. The general crossnorm $N$, constructed in Theorem 5.1 by means of
the greatest crossnorm $N_1$, we are justified to term self-associate (extending
hereby Definition 5.2). Theorem 5.1.1 states that the same construction
applied to any general crossnorm will always lead to a self-associate crossnorm.
We did not settle, however, the problem of "uniqueness," that is,
whether the same construction applied to two different general crossnorms
will always lead to the same self-associate crossnorm.

In connection with the question of uniqueness let us mention the following
problem.

Let $N^{(1)}$ and $N^{(2)}$ denote two general crossnorms, and $N^{(1)} \leq N^{(2)}$. Under what
conditions is $N^{(1)} + \bar N^{(1)} = N^{(2)} + \bar N^{(2)}$?

7. The following crossnorms are worth mentioning.

DEFINITION 7.1. Let $\sum_{i} f^0_i \otimes \varphi^0_i$ be in $\mathfrak{A}(E_1, E_2)$. We put $N(\sum_{i} f^0_i \otimes \varphi^0_i)
= \inf (\max_{\epsilon_i = \pm 1} \|\sum_{i=1}^{n} \epsilon_i \|f_i\| \varphi_i\|)$ where inf (that is, the greatest lower bound) is
taken over the set of all expressions $\sum_{i=1}^{n} f_i \otimes \varphi_i$ equivalent to $\sum_{i} f^0_i \otimes \varphi^0_i$.

THEOREM 7.1. $N$ in Definition 7.1 is a crossnorm.

(7) Added in proof: The author has since shown that if $E_1$ and $E_2$ are reflexive, that is,
$\bar{\bar E}_i = E_i$, then the associate of every crossnorm is also a crossnorm.


Proof. It is obvious that WN satisfies II and IV of Definition 3.1. We shall 
prove III. For two given expressions >-7,f:@¢:, >.1.1g:@yi and a given 
¢>0, we can find two expressions @¢!, such that 


k m 
~ fi @ vi: max 


1 


< + 6/2. 
Let ef = +1,¢=1,---,k3 nf = +1,j7=1, ---, 7, be chosen so that 


k r 
| + | 
j=1 


OW Ovi max 


elles | 


= max | lle? + Dulles 


ie 


i=1 
then IV and Definition 7.1 give 
k 
t=1 
inl 


of 
t=1 
This proves III. We shall prove VI 


F(fi)¢: 


Let ni,---, 7 denote that system of numbers 1, —1, for which the right 
side of the last inequality is a maximum. Then, 


and the proof of VI is a simple consequence of Lemma 2.1. Lemma 3.8 im- 
plies, therefore, I for N. 

We complete the proof of the theorem by showing that WN has the cross- 
property. Let f¥0, and Bei @g. Choose an FC such that 
F(f) =|{fl|, || Fl] =1. Lemma 2.1 gives 


S max 
e=t1 


or 


Al llell max 


This proves

\[ N(f \otimes \varphi) = \|f\|\,\|\varphi\|. \]


DEFINITION 7.2. We define a crossnorm N in A(E:, Ex) by means of the 
least crossnorm No in Ex). Let k denote a natural number. For sf: @¢i 
in we put 


Seven) 


j=l 


where sup (the least upper bound) is taken for all sequences of $k$ terms $F_1, \dots, F_k$
in $\bar E_1$; $\Phi_1, \dots, \Phi_k$ in $\bar E_2$.

THEOREM 7.2. For every natural $k$, $N_{(k)}$ is a crossnorm.


Proof. I. If Lemma 2.1 gives =0. If 
=0, then taking Fi\= --- =F, =F; we 
have >." ,f:@¢<~0@0 by Definition 4.2 and Lemma 4.4. 

That II, III, and IV hold is obvious. We shall prove that N@ has the 


cross-property. We have 


i j=1 j=1 


j=l 


= Ile*l. 


But on the other hand, putting Fi= --- = F,=F3gi= we obvi- 
ously get Na(f°@¢*) ||G°||. This completes the proof. 


REFERENCES 


1. S. Banach, Théorie des opérations linéaires, Warsaw, 1932.

2. D. G. Bourgin, Closure of products of functions, Bull. Amer. Math. Soc. vol. 46 (1940)
pp. 807-815.

3. J. A. Clarkson, Uniformly convex spaces, Trans. Amer. Math. Soc. vol. 40 (1936) pp.
396-414.

4. M. M. Day, Reflexive Banach spaces not isomorphic to uniformly convex spaces, Bull.
Amer. Math. Soc. vol. 47 (1941) pp. 313-317.

5. F. L. Hitchcock, The expression of a tensor or polyadic as a sum of products, Journal of
Mathematics and Physics vol. 6 (1927) pp. 164-189.

6. F. L. Hitchcock, Multiple invariants and generalized rank of a p-way matrix or tensor, ibid. vol. 7
(1927) pp. 40-79.

7. D. Milman, On some criteria for the regularity of spaces of type (B), C. R. (Doklady)
Acad. Sci. URSS. vol. 20 (1938) pp. 243-246.

8. F. J. Murray, On complementary manifolds and projections in spaces $L_p$ and $l_p$, Trans.
Amer. Math. Soc. vol. 41 (1937) pp. 138-152.

9. F. J. Murray, Bilinear transformations in Hilbert spaces, ibid. vol. 45 (1939) pp. 474-507.

10. F. J. Murray and J. von Neumann, On rings of operators, Ann. of Math. (2) vol. 37
(1936) pp. 116-229.

11. J. von Neumann, On infinite direct products, Compositio Math. vol. 6 (1938) pp. 1-77.

12. J. von Neumann, On a certain topology for rings of operators, Ann. of Math. (2) vol. 37 (1936) pp.
111-115.

13. R. Oldenburger, Composition and rank of n-way matrices and multilinear forms, ibid.
vol. 35 (1934) pp. 622-657.

14. R. Oldenburger, Nonsingular multilinear forms and certain p-way matrix factorizations, Trans.
Amer. Math. Soc. vol. 39 (1936) pp. 422-455.

15. B. J. Pettis, Proof that every uniformly convex space is reflexive, Duke Math. J. vol. 5
(1939) pp. 249-253.

16. H. Weyl, The theory of groups and quantum-mechanics, translated from German by
H. P. Robertson, New York, 1931.


COLUMBIA UNIVERSITY,
NEW YORK, N. Y.


DIRECT METHODS IN GEOMETRICAL OPTICS 


BY 
M. HERZBERGER 


Communication no. 851 from the Kodak Research Laboratories 


This paper presents a different approach to the problems of geometrical
optics, in order to attack some problems hitherto insoluble in practice.

The investigation has been restricted to rotationally symmetric optical systems
for practical reasons only; the procedures can be applied to the general
case. Let us choose two rectangular coordinate systems such that the z- and
z'-axes coincide with the optical axis, the y- and y'-, and x- and x'-axes being,
respectively, parallel. An object ray is then defined by its intersection point
with the plane $z = 0$ (vector $\vec a$, coordinates $x, y, 0$) and by the vector $\vec s$ of
length $n$ ($n$ refractive index) along the ray (coordinates $\xi$, $\eta$, $\zeta = (n^2 - \xi^2 - \eta^2)^{1/2}$);
the image ray is given by $\vec a\,'(x', y', 0)$ and $\vec s\,'(\xi', \eta', \zeta' = (n'^2 - \xi'^2 - \eta'^2)^{1/2})$ with
$n'$ the refractive index of the image space.

The problem of the optical designer is to find the image ray if the object
ray is given, or in other words, to compute, for a given optical system, $x'$,
$y'$, $\xi'$, and $\eta'$ as functions of $x$, $y$, $\xi$, and $\eta$. If this problem is solved for a single
surface and arbitrary positions of the object and image planes, it is solved
for any rotationally symmetrical optical system merely by making a succession
of substitutions. To solve the problem for a single surface, which will
here be assumed to be a spherical surface, we first place object and image
planes at the center of the refracting surface, and then calculate the functions.
Having done this, we have only to compute the intersection points of the
image rays with a parallel plane through another origin, a simple geometric
problem. We can use this method for the manifold of all rays in a procedure
similar to the ordinary way of tracing meridional rays; and applied to an
individual ray, it becomes a straightforward method for tracing skew rays
through an optical system.


I. GENERAL FORMULAE 


Before deriving these equations, we shall inspect the general conditions of
optical image formation(1). Because of the rotational symmetry we can write

Presented to the Society, October 31, 1942; received by the editors April 17, 1942.

(1) The term "optical image formation" refers to the one-to-one correspondence of object
and image rays in an optical system. Not every one-to-one correspondence can be considered
as an optical image formation. The conditions are derived in this paper. For the mathematician
it might be noted that an optical image formation is a special kind of contact transformation.


(1) \[ x' = Ax + B\xi, \quad \xi' = Cx + D\xi, \qquad y' = Ay + B\eta, \quad \eta' = Cy + D\eta, \]

in which $A$, $B$, $C$, and $D$ are functions of $u_1$, $u_2$, and $u_3$, which are symmetric
functions of $x$, $y$, $\xi$, and $\eta$:

(2) \[ u_1 = (1/2)(x^2 + y^2), \qquad u_2 = x\xi + y\eta, \qquad u_3 = (1/2)(\xi^2 + \eta^2). \]

However, $A$, $B$, $C$, and $D$ are not arbitrary functions of $u_1$, $u_2$, and $u_3$.
Our first task is to derive the differential equations connecting them.
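The role of the three invariants in (2) can be made concrete with a short sketch. It is illustrative only: the coefficient functions passed in as `coeffs` are arbitrary placeholders chosen for the example, and no claim is made that they satisfy the differential equations derived below. The sketch shows that $u_1, u_2, u_3$ do not change when $(x, y)$ and $(\xi, \eta)$ are rotated by the same angle about the axis, and that a map of the form (1) therefore commutes with such rotations.

```python
import math


def invariants(x, y, xi, eta):
    """The symmetric functions u1, u2, u3 of equation (2)."""
    return (0.5 * (x * x + y * y), x * xi + y * eta, 0.5 * (xi * xi + eta * eta))


def rotate(px, py, angle):
    """Rotate the pair (px, py) about the optical axis."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * px - s * py, s * px + c * py)


def map_ray(x, y, xi, eta, coeffs):
    """Apply equation (1); coeffs(u1, u2, u3) returns (A, B, C, D)."""
    a, b, c, d = coeffs(*invariants(x, y, xi, eta))
    return (a * x + b * xi, a * y + b * eta, c * x + d * xi, c * y + d * eta)


def example_coeffs(u1, u2, u3):
    # Placeholder functions for illustration, not a physical lens.
    return (1.0 + u1, 0.5 * u2, -0.2 * u3, 1.0)


ray = (0.3, -0.1, 0.05, 0.2)
theta = 0.7
rx, ry = rotate(ray[0], ray[1], theta)
rxi, reta = rotate(ray[2], ray[3], theta)

image_of_rotated = map_ray(rx, ry, rxi, reta, example_coeffs)
image = map_ray(*ray, example_coeffs)
rotated_image = rotate(image[0], image[1], theta) + rotate(image[2], image[3], theta)
```

Comparing `image_of_rotated` with `rotated_image` confirms the rotational symmetry for this example ray.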


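According to the fundamental optical invariant(2), for any two-parameter
manifolds (parameters $u$, $v$), we have (abbreviating $\partial \vec s / \partial u = \vec s_u$),

(3) \[ \vec s\,'_u \cdot \vec a\,'_v - \vec s\,'_v \cdot \vec a\,'_u = \vec s_u \cdot \vec a_v - \vec s_v \cdot \vec a_u. \]

Taking the following variables in turn for $u$ and $v$: $x, y$; $x, \xi$; $x, \eta$; $y, \xi$; $y, \eta$;
and $\xi, \eta$ and turning from the vectors to the coordinates, we find that

(4) \[ (x'_x \xi'_y + y'_x \eta'_y) - (x'_y \xi'_x + y'_y \eta'_x) = 0, \qquad (x'_x \xi'_\xi + y'_x \eta'_\xi) - (x'_\xi \xi'_x + y'_\xi \eta'_x) = 1, \]

and correspondingly for the remaining pairs of variables.

Writing $A_1$ for $\partial A / \partial u_1$, and so forth, we now obtain from (1) and (2),

(5) \[ x'_x = A + A_1 x^2 + (A_2 + B_1) x\xi + B_2 \xi^2, \qquad x'_y = A_1 xy + A_2 x\eta + B_1 y\xi + B_2 \xi\eta, \]

with similar expressions for the remaining derivatives.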

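Inserting (5) in (4), we obtain the equations

(6) \[ (x\eta - y\xi)\,I = 0, \]
\[ (AD - BC) + x^2\,I + x\xi\,II + \xi^2\,III = 1, \]
\[ xy\,I + x\eta\,II + \xi\eta\,III = 0, \]
\[ xy\,I + y\xi\,II + \xi\eta\,III = 0, \]
\[ (AD - BC) + y^2\,I + y\eta\,II + \eta^2\,III = 1, \]
\[ (x\eta - y\xi)\,III = 0; \]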
(2) For the historical background of formula (3) and its connection with different branches
of mathematics, the reader is referred to M. Herzberger, Theory of transversal curves and the
connections between the calculus of variations and the theory of partial differential equations,
Proc. Nat. Acad. Sci. U.S.A. vol. 24 (1938) pp. 466-473.


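where

(7) \[ I = AC_2 + DA_1 - CA_2 - BC_1 + 2u_1(A_1C_2 - A_2C_1) + u_2(A_1D_2 - A_2D_1 + B_1C_2 - B_2C_1) + 2u_3(B_1D_2 - B_2D_1), \]
\[ II = AC_3 + DB_1 - CA_3 - BD_1 + 2u_1(A_1C_3 - A_3C_1) + u_2(A_1D_3 - A_3D_1 + B_1C_3 - B_3C_1) + 2u_3(B_1D_3 - B_3D_1), \]
\[ III = AD_3 + DB_2 - CB_3 - BD_2 + 2u_1(A_2C_3 - A_3C_2) + u_2(A_2D_3 - A_3D_2 + B_2C_3 - B_3C_2) + 2u_3(B_2D_3 - B_3D_2). \]

Equations (6) can be satisfied for all rays only if

(8) \[ I = II = III = 0; \qquad AD - BC = 1. \]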


The fulfillment of the four equations (8) is necessary and sufficient to
guarantee that equation (1) determines an optical image formation. We can
of course eliminate one of the functions, for example, $D$, and have three functions
$A$, $B$, and $C$, and three differential equations connecting them. We find
that

(9) \[ D = \frac{1 + BC}{A}, \qquad D_i = \frac{C}{A}\,B_i + \frac{B}{A}\,C_i - \frac{1 + BC}{A^2}\,A_i \quad (i = 1, 2, 3). \]


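Inserting this in (7), we obtain these equations:

(10) \[ (AC_2 - CA_2) + (CA_1 - AC_1)\frac{B}{A} + \frac{A_1}{A} + (A_1C_2 - A_2C_1)\Big(2u_1 + \frac{B}{A}u_2\Big) + (A_1B_2 - A_2B_1)\Big(\frac{C}{A}u_2 + 2u_3\frac{1 + BC}{A^2}\Big) + (B_1C_2 - B_2C_1)\Big(u_2 + 2u_3\frac{B}{A}\Big) = 0, \]

\[ (AC_3 - CA_3) + (CA_1 - AC_1)\frac{B^2}{A^2} + \frac{BA_1 + AB_1}{A^2} + (A_1C_3 - A_3C_1)\Big(2u_1 + \frac{B}{A}u_2\Big) + (A_1B_3 - A_3B_1)\Big(\frac{C}{A}u_2 + 2u_3\frac{1 + BC}{A^2}\Big) + (B_1C_3 - B_3C_1)\Big(u_2 + 2u_3\frac{B}{A}\Big) = 0, \]

\[ \frac{B}{A}(AC_3 - CA_3) - \frac{B^2}{A^2}(AC_2 - CA_2) + \frac{AB_2 + BA_2}{A^2} - \frac{A_3}{A} + (A_2C_3 - A_3C_2)\Big(2u_1 + \frac{B}{A}u_2\Big) + (A_2B_3 - A_3B_2)\Big(\frac{C}{A}u_2 + 2u_3\frac{1 + BC}{A^2}\Big) + (B_2C_3 - B_3C_2)\Big(u_2 + 2u_3\frac{B}{A}\Big) = 0. \]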


Let us multiply these equations by $A_3, -A_2, A_1$; $B_3, -B_2, B_1$; $C_3, -C_2, C_1$, respectively,
and add. We thus obtain three new equations, which can replace
equations (10) if

(11) \[ \begin{vmatrix} A_1 & A_2 & A_3 \\ B_1 & B_2 & B_3 \\ C_1 & C_2 & C_3 \end{vmatrix} = \Delta \neq 0; \]

that is, $A$, $B$, and $C$ are three independent functions. (In this case we can
construct their inverse functions, that is, we can calculate $u_1$, $u_2$, and $u_3$ as
functions of $A$, $B$, and $C$. We shall make use of this later.) In either case, we
obtain from (10) three equations:


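(12) \[ A(C_2A_3 - C_3A_2) + B(A_1C_3 - C_1A_3) + \frac{B^2}{A}(C_1A_2 - A_1C_2) + \frac{1}{A}(B_2A_1 - B_1A_2) + \Delta\Big(u_2 + 2u_3\frac{B}{A}\Big) = 0, \]

\[ A(C_2B_3 - C_3B_2) + B(B_1C_3 - C_1B_3) + C(A_3B_2 - A_2B_3) + \frac{1 + BC}{A}(A_1B_3 - B_1A_3) + \frac{B(1 + BC)}{A^2}(A_2B_1 - A_1B_2) + \frac{B^2}{A}(C_1B_2 - C_2B_1) - \Delta\Big(2u_1 + \frac{B}{A}u_2\Big) = 0, \]

\[ C(C_2A_3 - C_3A_2) + \frac{1 + BC}{A}(A_1C_3 - C_1A_3) + \frac{B(1 + BC)}{A^2}(C_1A_2 - A_1C_2) + \frac{1}{A}(B_2C_1 - B_1C_2) + \Delta\Big(\frac{C}{A}u_2 + 2u_3\frac{1 + BC}{A^2}\Big) = 0. \]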


These equations are equivalent to: 
A(AxC3 — Ai) + B(AL: — + C(A2By — 
+ A(BL; — ByC2) + 2u;A = 0, 


1+ BC 
— AL3) + 7 (A2B, — A, B2) — B(BL, — + = 0, 


(13) 
— C3Bz) + B(BiC3 — CiB3) + C(AsB2 — A2Bs) 
1+ BC 
— — + 7 (A,B; — B;As) 


+ 2u,A = 0. 


If $\Delta$ is not identically equal to zero, we now get very simple equations for
the inverse functions, if we insert


A3B, — 
A 
A,B: 


1+ BC Ous 
—+B— = 0, 


1+ BC due 
u, = 0, 
A 


as fundamental equations for optical image formation. 


II. A SPECIAL KIND OF OPTICAL IMAGE FORMATION 

As an example, let us consider a special case in which $\Delta = 0$. Let us assume
$B = 0$; then

(16) \[ x' = Ax, \quad \xi' = Cx + (1/A)\xi, \qquad y' = Ay, \quad \eta' = Cy + (1/A)\eta. \]

This kind of image formation is of great importance in optics. A single
sphere produces this type of image; so does a system of concentric surfaces,
a so-called concentric lens system. On the other hand, the attempt to realize
the dream of an optical designer, a system which gives a sharp image of one
plane $z = 0$ upon another ($z' = 0$), leads to an image formation given by
formula (16) with the special condition that $A$ is a constant.

In general, $A$ must be different from 0 and, inserting $B = B_1 = B_2 = B_3 = 0$,
equations (10) give

(17) \[ (AC_2 - CA_2) + (A_1/A) + (A_1C_2 - A_2C_1)2u_1 = 0, \]
\[ (AC_3 - CA_3) + (A_1C_3 - A_3C_1)2u_1 = 0, \]
\[ -(A_3/A) + (A_2C_3 - A_3C_2)2u_1 = 0, \]

as differential equations for the two functions. Equations (12) give

(18) \[ A(C_2A_3 - C_3A_2) = 0 \quad\text{and}\quad C(C_2A_3 - C_3A_2) + (1/A)(A_1C_3 - C_1A_3) = 0, \]

or

(19) \[ C_2A_3 - C_3A_2 = A_1C_3 - A_3C_1 = 0. \]

Inserting (19) in (17), we find that

(20) \[ A_3 = C_3 = 0, \qquad (AC_2 - CA_2) + (A_1/A) + (A_1C_2 - A_2C_1)2u_1 = 0. \]

We see that in this case $A$ and $C$ are functions of $u_1$ and $u_2$ alone. We introduce
$D = 1/A$ and find, as differential equations for $C$ and $D$,

(21) \[ D_3 = C_3 = 0, \qquad DC_2 + CD_2 - DD_1 + (C_1D_2 - D_1C_2)2u_1 = 0. \]


In the special case of D =const., C.=0. That means C is a function of u; alone. 
We have 


x’ = Aox, = C(us)x + (1/Ao)é, 
y =Aoy, = C(ur)y + (1/Ao)n. 


Equation (22) is a generalization of the well known sine condition of 
Abbe; Ao is the magnification of the image and (22), can be written in the 
familiar form 


(23) — = Aon’ — = C(u)y, 


where the right sides of equations (23) are independent of £ and 7. 
Let us now investigate the general case, in which C and D are eee 
functions of u; and ue, such that 


(22) 


He 
“4 i 
| 
| 
i 
. 
an 
i 


224 M. HERZBERGER . [March 


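(24) $C_1D_2 - C_2D_1 \neq 0.$
Here we may again construct the inverse functions, and consider $u_1$ and $u_2$
as functions of C and D. Then we have

(25) $\dfrac{\partial u_1}{\partial C} = \dfrac{D_2}{C_1D_2 - C_2D_1}, \quad \dfrac{\partial u_1}{\partial D} = \dfrac{-C_2}{C_1D_2 - C_2D_1},$
$\dfrac{\partial u_2}{\partial C} = \dfrac{-D_1}{C_1D_2 - C_2D_1}, \quad \dfrac{\partial u_2}{\partial D} = \dfrac{C_1}{C_1D_2 - C_2D_1}.$

Inserting (25) in (21) gives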


(26) 


The solution of this differential equation would solve our problem. 
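The inversion step behind (25) is the usual inverse-Jacobian rule, and it can be checked numerically. The following is a minimal sketch only: the map `forward` is a hypothetical smooth pair of functions chosen for illustration (it is not the C and D of any optical system in this paper), and the names `forward`, `partials`, `inverse` and `check_inverse_derivatives` are our own.

```python
def forward(u1, u2):
    """Hypothetical smooth map (u1, u2) -> (C, D); an assumption for illustration."""
    C = u1 + 0.5 * u2 * u2
    D = 1.0 + u2 + 0.25 * u1 * u1
    return C, D


def partials(u1, u2):
    """Analytic partial derivatives (C_1, C_2, D_1, D_2) of `forward`."""
    return 1.0, u2, 0.5 * u1, 1.0


def inverse(C, D, guess, steps=50):
    """Newton iteration for (u1, u2) such that forward(u1, u2) == (C, D)."""
    u1, u2 = guess
    for _ in range(steps):
        c, d = forward(u1, u2)
        C1, C2, D1, D2 = partials(u1, u2)
        det = C1 * D2 - C2 * D1  # must not vanish, cf. condition (24)
        rc, rd = C - c, D - d
        u1 += (D2 * rc - C2 * rd) / det
        u2 += (-D1 * rc + C1 * rd) / det
    return u1, u2


def check_inverse_derivatives(u1=0.3, u2=0.2, h=1e-6):
    """Compare the inverse-Jacobian formulas with central differences."""
    C, D = forward(u1, u2)
    C1, C2, D1, D2 = partials(u1, u2)
    det = C1 * D2 - C2 * D1
    # order: du1/dC, du1/dD, du2/dC, du2/dD
    predicted = (D2 / det, -C2 / det, -D1 / det, C1 / det)
    a = inverse(C + h, D, (u1, u2))
    b = inverse(C - h, D, (u1, u2))
    c = inverse(C, D + h, (u1, u2))
    d = inverse(C, D - h, (u1, u2))
    numeric = ((a[0] - b[0]) / (2 * h), (c[0] - d[0]) / (2 * h),
               (a[1] - b[1]) / (2 * h), (c[1] - d[1]) / (2 * h))
    return predicted, numeric
```

The two tuples returned by `check_inverse_derivatives` should agree to within the finite-difference error as long as the determinant in (24) stays away from zero.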


III. THE SINGLE SPHERICAL SURFACE 


Again we attack a special case of the aforesaid problem, namely, the single
refracting spherical surface. The coordinate origins in the object and image
spaces are placed at the center of the sphere, the x- and x'-, y- and y'-axes
coinciding. Let $\vec{a}$ ($\vec{a}'$) be the vector $(x, y, 0)$ ($(x', y', 0)$, respectively) to the
intersection point of the ray with the coordinate plane $z = 0$ ($z' = 0$); let $\vec{s} = (\xi, \eta, \zeta)$ and
$\vec{s}' = (\xi', \eta', \zeta')$, respectively, be the vectors of length $n$ ($n'$) along the object
and image rays.

Let r be the radius of the refracting surface, positive if the surface is convex
with respect to the direction of the light ray, and negative if the surface is
concave. Let $\vec{r}$ be the vector of length r along the incidence normal. The
refraction law can then be written


(27) $\vec{s} \times \vec{r} = \vec{s}' \times \vec{r}$, or $\vec{s}' - \vec{s} = C\vec{r}$.
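The vector form of the refraction law in (27) lends itself to a short numerical check. The sketch below is an illustration under stated assumptions, not the ray-tracing scheme of this paper: the ray is taken to meet the sphere at its first intersection, the sign of C is fixed by requiring the refracted ray to continue through the surface, and the helper names (`dot`, `cross`, `axpy`, `refract_at_sphere`) are our own.

```python
import math


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def axpy(alpha, u, v):
    """Return alpha*u + v for 3-vectors."""
    return tuple(alpha * p + q for p, q in zip(u, v))


def refract_at_sphere(a, s, radius, n_image):
    """Refract a ray at a sphere centred at the origin.

    a: point of the (extended) ray in the plane z = 0 through the centre.
    s: direction vector whose length is the object-space index n.
    Returns (a_out, s_out, C), with s_out - s = C * rho, rho the incidence point.
    """
    n2 = dot(s, s)
    b = dot(a, s)
    disc = b * b - n2 * (dot(a, a) - radius ** 2)
    if disc < 0:
        raise ValueError("ray misses the sphere")
    lam = (-b - math.sqrt(disc)) / n2      # first intersection (assumption)
    rho = axpy(lam, s, a)                  # vector of length |radius| along the normal
    sn = dot(s, rho)
    root = sn * sn + radius ** 2 * (n_image ** 2 - n2)
    if root < 0:
        raise ValueError("total internal reflection")
    # choose the root for which s_out . rho keeps the sign of s . rho
    C = (-sn + math.copysign(math.sqrt(root), sn)) / radius ** 2
    s_out = axpy(C, rho, s)
    mu = -rho[2] / s_out[2]                # back to the plane z = 0
    a_out = axpy(mu, s_out, rho)
    return a_out, s_out, C
```

With this construction the refracted vector has length n', the vector product of a and s is unchanged by refraction, and the image-side vector a' is parallel to a, which is what the relations following (27) assert.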


The coordinates are then given by equations (2):


(28) $u_1 = (x^2 + y^2)/2 = \vec{a}^{\,2}/2,$
$u_2 = x\xi + y\eta = \vec{a}\cdot\vec{s}.$


Two values, $\lambda$ and $\lambda'$, exist such that
(29) $\vec{a} + \lambda\vec{s} = \vec{r} = \vec{a}' + \lambda'\vec{s}'$,


and $\vec{a}'$ and $\vec{a}$ must have the same direction, since they both lie in the
intersection line of the incidence plane with the plane $z = 0$. From (29) and (27)
we find that 


Ou, Ou, 
C— D— + D— + = 0. 
ac aD 




This gives finally

(31) $\vec{a}' = (1/D)\vec{a},$
$\vec{s}' = C\vec{a} + D\vec{s},$

where C is given by the second equation of (27); $\vec{a} \times \vec{s}$ is an invariant vector
for refraction. Its direction is perpendicular to the incidence plane; its length p is the
length of the perpendicular dropped from the center to the incident (refracted)
ray, multiplied by the corresponding refractive index.

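We have


(32) $p^2 = (\vec{a} \times \vec{s})^2 = \vec{a}^{\,2}\vec{s}^{\,2} - (\vec{a}\cdot\vec{s})^2 = 2n^2u_1 - u_2^2.$
Equation (31) now gives

(33) $\vec{s}' \times \vec{s} = C(\vec{a} \times \vec{s}),$
$\vec{s}'\cdot\vec{s} = C\,\vec{a}\cdot\vec{s} + D\,\vec{s}^{\,2}.$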


If $\delta$ is the angle between $\vec{s}$ and $\vec{s}'$ (the angle of deviation), equations (33)
are equivalent to


$nn'\sin\delta = Cp, \quad nn'\cos\delta = Cu_2 + Dn^2,$


or
(34) $C = \dfrac{nn'\sin\delta}{p}, \quad D = \dfrac{1}{n^2}\left(nn'\cos\delta - \dfrac{nn'\sin\delta}{p}\,u_2\right).$


Our remaining problem is to express $\delta$ as a function of p, and then, by
using (32), as a function of $u_1$ and $u_2$. The second equation of (27) gives


(35) 
or, because of (30), 

(36) $C = (1/r)\{(n'^2 - (p/r)^2)^{1/2} - (n^2 - (p/r)^2)^{1/2}\},$

(37) $\sin\delta = (p/nn'r)\{(n'^2 - (p/r)^2)^{1/2} - (n^2 - (p/r)^2)^{1/2}\},$
$\cos\delta = (p^2/nn'r^2) + (1/nn')([n'^2 - (p/r)^2][n^2 - (p/r)^2])^{1/2}.$

Inserting (37) in (34) gives C and D as functions of p and $u_2$, and therefore,
because of (32), as functions of $u_1$ and $u_2$.

To solve the reverse problem, that is, to calculate $u_1$ and $u_2$ as functions
of C and D, we proceed as follows:

Equation (37) gives


(38a) $(\cos\delta - p^2/nn'r^2)^2 = (1/nn')^2(n'^2 - p^2/r^2)(n^2 - p^2/r^2),$




(38b) $(nn'\sin\delta)^2 + (p^2/r^2)(2nn'\cos\delta - n^2 - n'^2) = 0;$


or, considering (34),
(39) $2nn'\cos\delta = n^2 + n'^2 - C^2r^2.$


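Inserting this in the second equation of (34), we obtain

(40) $u_2 = \dfrac{n^2 + n'^2 - C^2r^2}{2C} - \dfrac{Dn^2}{C}.$

Equation (34) gives

(41) $\left(\dfrac{nn'\sin\delta}{C}\right)^2 = \dfrac{n^2n'^2}{C^2} - \left(\dfrac{n^2 + n'^2 - C^2r^2}{2C}\right)^2 = 2n^2u_1 - \left(\dfrac{n^2 + n'^2 - C^2r^2}{2C} - \dfrac{Dn^2}{C}\right)^2,$

which gives

(42) $u_1 = Dr^2/2 + \dfrac{1 - D}{2C^2}(n'^2 - n^2D),$
$u_2 = -Cr^2/2 + \dfrac{n^2(1 - D) + (n'^2 - n^2D)}{2C}.$

We see that $u_1$ and $u_2$ are rational functions of C and D, fulfilling equation
(26).

Equations (42) are very valuable for calculating the coefficients of C and
D, written as a series in $u_1$ and $u_2$.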
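The passage from $(u_1, u_2)$ to (C, D) and back can be exercised numerically. The sketch below is an illustration under assumptions: the forward map follows our reading of (32), (34), (36) and (37), the inverse follows our reading of (42), and the function names and sample values are hypothetical. The round trip does not depend on the sign convention adopted for C, since only squares of the bracket in (36) enter the inverse formulas.

```python
import math


def forward_C_D(u1, u2, n, n_image, radius):
    """C and D of a single spherical surface as functions of u1 and u2."""
    p2 = 2.0 * n * n * u1 - u2 * u2       # p^2 from (32)
    q = p2 / radius ** 2
    alpha = math.sqrt(n_image ** 2 - q)
    beta = math.sqrt(n ** 2 - q)
    C = (alpha - beta) / radius           # reading of (36)
    nn_cos = q + alpha * beta             # n n' cos(delta), reading of (37)
    D = (nn_cos - C * u2) / n ** 2        # second equation of (34)
    return C, D


def inverse_u1_u2(C, D, n, n_image, radius):
    """u1 and u2 as rational functions of C and D, reading of (42)."""
    u1 = D * radius ** 2 / 2 + (1 - D) * (n_image ** 2 - n ** 2 * D) / (2 * C * C)
    u2 = (-C * radius ** 2 / 2
          + (n ** 2 * (1 - D) + n_image ** 2 - n ** 2 * D) / (2 * C))
    return u1, u2
```

For instance, `inverse_u1_u2(*forward_C_D(0.05, 0.02, 1.0, 1.5, 2.0), 1.0, 1.5, 2.0)` should reproduce the pair (0.05, 0.02) up to rounding. The inverse requires C different from zero, that is, n different from n'.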


IV. THE PLANE SURFACE 


The equations for refracting a ray at a plane surface are very simple.
We let the z- and z'-axes coincide with the axis of the system, and place the
origins at the intersection point of the axis with the plane surface. Applying
the refraction law here gives


(43) $x' = x, \quad \xi' = \xi, \quad y' = y, \quad \eta' = \eta.$


This means that the transformation is the identical transformation.


V. TRANSITION FORMULAE 


To obtain the formulae for tracing a system of rays through a system of 
centered lenses, we must also know how to find, from the coordinates of the 
intersection point of the image ray with the reference plane through the 




center of one surface, the coordinates of the ray intersection in the reference
plane through the center of the succeeding surface. We call the distance
between the two centers m. From analytical geometry, we get

m 


(1 — 2u3/n*)3/2 


= 
(44) 


y=y 


4 m 
= 
(1 — 2u3/n*)*/? 

In tracing rays, we can use the formulae described in the preceding section.
We can simplify the calculation since, instead of tracing x, x', y' and y,
it proves to be sufficient to trace $u_1$, $u_2$, and $u_3$, because the functions depend
only upon these variables. Moreover, for any optical image formation, we
have the relation


(45) — ug = — = (2m — = — 
which follows directly from (1). The actual ray tracing formulae will be pub- 
lished later. 

VI. IMAGE ERROR THEORY 


Finally, let us sketch briefly how the image error theory can be derived, 
using this new method of approach. 

Let us develop A, B, C, and D as functions of $u_1$, $u_2$, and $u_3$, into a series
and inspect the equations for small values of $u_1$, $u_2$, and $u_3$.

If we assume $u_1$, $u_2$, and $u_3$ to be negligible, we obtain Gaussian optics,
within the realm of which A, B, C, and D can be regarded as constant values.
We have
(46) $x' = A_0x + B_0\xi, \quad \xi' = C_0x + D_0\xi,$

$y' = A_0y + B_0\eta, \quad \eta' = C_0y + D_0\eta,$
with (8):


$A_0D_0 - B_0C_0 = 1.$


The special case, that object and image are in optically conjugated points,
is indicated by


(47) $B_0 = 0.$
Then $D_0 = 1/A_0$, and
(48) $x' = A_0x, \quad \xi' = C_0x + (1/A_0)\xi,$
$y' = A_0y, \quad \eta' = C_0y + (1/A_0)\eta.$
From (48) and (46) can be derived all the laws of Gaussian optics.
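In the Gaussian realm the constants $A_0, B_0, C_0, D_0$ form a matrix of determinant 1, and successive systems compose by matrix multiplication. The sketch below illustrates this with exact rational arithmetic. The thin-lens example (two free transfers around a surface of power 1/f, in reduced units) is a hypothetical system chosen for illustration and is not taken from the paper; the function names are our own.

```python
from fractions import Fraction


def compose(first, second):
    """Matrix ((A, B), (C, D)) of `first` followed by `second`."""
    (a1, b1), (c1, d1) = first
    (a2, b2), (c2, d2) = second
    return ((a2 * a1 + b2 * c1, a2 * b1 + b2 * d1),
            (c2 * a1 + d2 * c1, c2 * b1 + d2 * d1))


def determinant(m):
    (a, b), (c, d) = m
    return a * d - b * c


def transfer(t):
    """Free propagation over the reduced distance t (hypothetical element)."""
    return ((Fraction(1), Fraction(t)), (Fraction(0), Fraction(1)))


def thin_lens(f):
    """Thin element of focal length f (hypothetical element)."""
    return ((Fraction(1), Fraction(0)), (Fraction(-1) / Fraction(f), Fraction(1)))


def system(t1, f, t2):
    """Object plane -> lens -> image plane."""
    return compose(compose(transfer(t1), thin_lens(f)), transfer(t2))
```

With t1 = 3, f = 2, t2 = 6 the planes are conjugate: `system(3, 2, 6)` has $B_0 = 0$, magnification $A_0 = -2$ and $D_0 = 1/A_0 = -1/2$, in agreement with (47) and (48).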




Let us now consider the linear members of A, B, C, and D, but neglect
all higher orders. This leads to the so-called Seidel theory of aberrations. We
have again


(49) $D = \dfrac{1 + BC}{A},$


but the nine first-order derivatives of A, B, and C are not independent.
Equations (9) show that between these derivatives at $u_1 = u_2 = u_3 = 0$ there exist the
three equations

Bo A 1 
— (Ag, — CoA1) — — CoA2) = —> 
Ao A 


2 

B + 

— CoA1) — (AoC3 — CoAs) = 
0 


CoAs) Bo CoAs) = 
A? 2 0422. Ao — 


From these equations, we get 
— A:) = Ao(Bz — As), 
Bo Ai 
— = — (AoC, — — 
Ao Ao 


B BoA, — BiA 
Ag 


A? 


Let us now put the origin at the Gaussian image point, so that $B_0 = 0$,
and equations (51) simplify to


Substituting this in the first equation of (1), we find, after rearranging,
that

(53) $x' - A_0x = (A_1u_1 + A_2u_2 + A_3u_3)x + (B_1u_1 + B_2u_2 + B_3u_3)\xi,$
$y' - A_0y = (A_1u_1 + A_2u_2 + A_3u_3)y + (B_1u_1 + B_2u_2 + B_3u_3)\eta.$


A; 
BoAs — BrAc . 
BoAz — Bz 
(50) 
(51 
Be = A;, 
A, 
AL, — = — Ay’ 
By 
(52) 




Geometrical inspection will show that the five quantities $A_1$, $A_2$, $A_3 = B_2$,
$B_1$, and $B_3$ correspond to the five image errors:

$B_3$ represents the spherical aberration;

$A_3 = B_2$ represents the coma error;

$A_2$ and $B_1$ represent the field errors;

$A_1$ represents the distortion.

In (53) the coordinates x', y' of the intersections of the image ray with
the image plane are given as functions of x, y, $\xi$, $\eta$. $x = y = 0$ characterizes
the axis point of the object, $\xi = \eta = 0$ characterizes the infinite point. We can
say, therefore, that $A_1, A_2, A_3, B_1, B_2, B_3$ give us the image errors of our
object for a stop at infinity. To obtain the image errors for a finite stop, we
have to replace $\xi$, $\eta$ by the coordinates $x_p$, $y_p$ of the intersection point of
our ray with the plane of the diaphragm. Within the region of validity of the
Seidel theory, we get a simple linear transformation. If k is the distance
between object and stop, we have


(54) $x_p = x + k\xi,$
$y_p = y + k\eta.$


Inserting (54) into (53) we find x' and y' as functions of x, y, $x_p$, $y_p$, the
coefficients being the image errors for finite stop.

The method developed in this paper allows one to obtain the image coordinates
in a rotation symmetric optical system as functions of the object
coordinates by a series of substitutions. The only other general method, that of
Hamilton's characteristic function, leads to an elimination problem, hitherto
unsolved.

Hamilton’s method is more elegant since it uses only a single function to 
describe an optical instrument; the method of this paper leads to four func- 
tions connected by three differential equations. However, an explicit way was 
found to construct our function for any given system of centered lenses, 
whereas the characteristic function of Hamilton is explicitly known only for 
a single refracting surface or a plane parallel plate. Thus, the new method 
seems to be more adaptable to practical problems. 

The last paragraph tries to show that the access to the image theory by 
the direct method is as simple as it is by using Hamilton’s characteristic 
function. 


ROCHESTER, N. Y.




MEAN-VALUES AND HARMONIC POLYNOMIALS 


BY 
E. F. BECKENBACH AND MAXWELL READE


INTRODUCTION 


0.1. It is well known that if the function f(x, y) is harmonic in a finite
domain (non-null connected open set) D, then at each point $(x_0, y_0)$ in D,
f(x, y) satisfies the equation


(1) $f(x_0, y_0) = \dfrac{1}{\pi r^2}\iint_{D(x_0, y_0; r)} f(x_0 + \xi, y_0 + \eta)\,d\xi\,d\eta$


for each circular disc
$D(x_0, y_0; r):\ \xi^2 + \eta^2 = (x - x_0)^2 + (y - y_0)^2 \leq r^2$


lying in D. Conversely, if f(x, y) is superficially summable in the interior of a
finite domain D, and if (1) holds for each point $(x_0, y_0)$ and each disc $D(x_0, y_0; r)$
about $(x_0, y_0)$ in D, then f(x, y) is harmonic in D(1).

It follows that (1) may be taken as the defining equation for harmonic 
functions. 
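The areal mean-value property (1) is easy to observe numerically. The following sketch uses a plain midpoint rule in polar coordinates; the function name `disc_average`, the grid sizes and the two sample functions are our own choices for illustration.

```python
import math


def disc_average(f, x0, y0, r, n_rad=100, n_ang=64):
    """Midpoint-rule average of f over the disc D(x0, y0; r)."""
    h_rho = r / n_rad
    h_th = 2.0 * math.pi / n_ang
    total = 0.0
    for i in range(n_rad):
        rho = (i + 0.5) * h_rho
        for j in range(n_ang):
            th = (j + 0.5) * h_th
            total += f(x0 + rho * math.cos(th), y0 + rho * math.sin(th)) * rho
    return total * h_rho * h_th / (math.pi * r * r)


def harmonic_example(x, y):
    """exp(x) cos(y), the real part of exp(x + iy), hence harmonic."""
    return math.exp(x) * math.cos(y)


def non_harmonic_example(x, y):
    """x^2 + y^2, whose Laplacian is 4, hence not harmonic."""
    return x * x + y * y
```

For the harmonic example the disc average reproduces the value at the centre, while for $x^2 + y^2$ the average exceeds the central value by $r^2/2$.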

0.2. Similarly, if f(x, y) is superficially summable in the interior of a finite
simply-connected domain D, and if f(x, y) is summable on each circle


$C(x_0, y_0; r):\ \xi^2 + \eta^2 = (x - x_0)^2 + (y - y_0)^2 = r^2$


lying in D, then a necessary and sufficient condition that f(x, y) be harmonic
in D is that at each point $(x_0, y_0)$ in D, f(x, y) satisfy the equation


(2) $f(x_0, y_0) = \dfrac{1}{2\pi r}\int_{C(x_0, y_0; r)} f(x_0 + \xi, y_0 + \eta)\,ds$


for each circle $C(x_0, y_0; r)$ in D.
As with (1), (2) may be taken as the defining equation for harmonic functions.
0.3. The following theorem is analogous to a result of Beckenbach and
Radó concerning subharmonic functions(2).


THEOREM 1. If f(x, y) is continuous in a finite domain D, then a necessary 


Presented to the Society, December 30, 1941; received by the editors March 20, 1942, 

(1) See E. Levi, Sopra una proprietà caratteristica delle funzione armoniche, Atti della Reale
Academia Lincei vol. 18 (1909) pp. 10-15, and L. Tonelli, Sopra una proprietà caratteristica delle
funzione armoniche, ibid. pp. 577-582.

(2) E. F. Beckenbach and Tibor Radó, Subharmonic functions and surfaces of negative curvature,
Trans. Amer. Math. Soc. vol. 35 (1933) pp. 662-674.




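and sufficient condition that f(x, y) be harmonic in D is that for each point
$(x_0, y_0)$ in D, the equation


(3) $\dfrac{1}{2\pi r}\int_{C(x_0, y_0; r)} f(x_0 + \xi, y_0 + \eta)\,ds = \dfrac{1}{\pi r^2}\iint_{D(x_0, y_0; r)} f(x_0 + \xi, y_0 + \eta)\,d\xi\,d\eta$

hold for all $D(x_0, y_0; r)$ in D.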


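Proof. If f(x, y) is harmonic in D, then (3) follows from (1) and (2).
To prove that (3) is a sufficient condition that f(x, y) be harmonic in D,
we consider the circular average(3)


$f(x, y; \rho) = \dfrac{1}{\pi\rho^2}\iint_{D(x, y; \rho)} f(x + \xi, y + \eta)\,d\xi\,d\eta,$


which is defined in an open subset of D. For $D(x_0, y_0; \rho)$ in D, a computation
yields


$\dfrac{d}{d\rho} f(x_0, y_0; \rho) = \dfrac{2}{\rho}\left[\dfrac{1}{2\pi\rho}\int_{C(x_0, y_0; \rho)} f(x_0 + \xi, y_0 + \eta)\,ds - \dfrac{1}{\pi\rho^2}\iint_{D(x_0, y_0; \rho)} f(x_0 + \xi, y_0 + \eta)\,d\xi\,d\eta\right],$


which, with (3), shows that $f(x_0, y_0; \rho)$ is independent of $\rho$. But since f(x, y)
is continuous, we have $f(x, y; \rho) \to f(x, y)$, as $\rho \to 0$, on each closed subset of
D, so that


$f(x_0, y_0) = f(x_0, y_0; r) = \dfrac{1}{\pi r^2}\iint_{D(x_0, y_0; r)} f(x_0 + \xi, y_0 + \eta)\,d\xi\,d\eta$
for each $D(x_0, y_0; r)$ in D. Therefore f(x, y) is harmonic in D.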

If we should assume only that f(x, y) is superficially summable and satis- 
fies (3), then it would not follow that f(x, y) is harmonic; consider, for ex- 
ample, the function which vanishes identically except at the origin, where it 
assumes the value 1. 

0.4. The right-hand members in (1) and (2) are areal and peripheral averages
(mean-values), respectively; in each instance the range of integration is
circular. The question arises as to the nature of the functions which are
defined by relations similar to (1), (2) and (3), in which the geometric figure
is square, elliptic, and so on(4).

In this paper we delineate the classes of functions defined by the condition 


(3) See § 1.3 below.

(4) Cf. W. Brödel, Funktionen mit Gaussischer Mittelwerteigenschaft für konvexe Kurven und
Bereiche, Deutsche Mathematik, vol. 4 (1939) pp. 3-15. By combining our methods with his,
one can simplify the proofs of some of his results concerning general mean-values.




that their averages over regular polygons of n sides satisfy conditions similar
to (1), (2) and (3)(5).

0.5. Since a circle may be considered as a limit of a sequence of circum- 
scribed (or inscribed) regular polygons, the results of this introductory section 
may be considered as limiting cases of some of the results obtained below. 


1. 
1.1. We recall that if m=2, then for all angles y we have 


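(4) $\sum_{m=0}^{n-1}\sin\left(\psi + \dfrac{2\pi m}{n}\right) = \sum_{m=0}^{n-1}\cos\left(\psi + \dfrac{2\pi m}{n}\right) = 0,$


(5) $\sum_{m=0}^{n-1}\sin\left(\psi + \dfrac{2\pi m}{n}\right)\cos\left(\psi + \dfrac{2\pi m}{n}\right) = 0,$


(6) $\sum_{m=0}^{n-1}\sin^2\left(\psi + \dfrac{2\pi m}{n}\right) = \sum_{m=0}^{n-1}\cos^2\left(\psi + \dfrac{2\pi m}{n}\right) = n/2.$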


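Thus


$\sum_{m=0}^{n-1}\left[\cos\left(\psi + \dfrac{2\pi m}{n}\right) + i\sin\left(\psi + \dfrac{2\pi m}{n}\right)\right] = (\cos\psi + i\sin\psi)\sum_{m=0}^{n-1}\left(\cos\dfrac{2\pi m}{n} + i\sin\dfrac{2\pi m}{n}\right),$
and this vanishes since the sum of the nth roots of unity is zero. The remaining
formulas in (4), (5) and (6) can be established in a similar way.
More generally, for $k \geq 1$,


(7) $\sum_{m=0}^{n-1}\left[\cos k\left(\psi + \dfrac{2\pi m}{n}\right) + i\sin k\left(\psi + \dfrac{2\pi m}{n}\right)\right] = n\,\delta_{k,n}(\cos k\psi + i\sin k\psi),$
where $\delta_{k,n} = 1$ if k is an integral multiple of n, and where $\delta_{k,n} = 0$ otherwise(6).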
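Identities (4) to (7) are statements about sums over the nth roots of unity, and a direct numerical check is straightforward. The sketch below is illustrative; the function names are our own, and the checks are run for $n \geq 3$, the case needed for the polygons considered below.

```python
import cmath
import math


def rotated_power_sum(k, n, psi):
    """Sum over m of exp(i k (psi + 2 pi m / n)), the left member of (7)."""
    return sum(cmath.exp(1j * k * (psi + 2.0 * math.pi * m / n)) for m in range(n))


def expected_power_sum(k, n, psi):
    """n * exp(i k psi) if n divides k, and 0 otherwise."""
    return n * cmath.exp(1j * k * psi) if k % n == 0 else 0.0


def quadratic_sums(n, psi):
    """Return (sum sin*cos, sum sin^2, sum cos^2) over the n rotated angles."""
    angles = [psi + 2.0 * math.pi * m / n for m in range(n)]
    sc = sum(math.sin(t) * math.cos(t) for t in angles)
    ss = sum(math.sin(t) ** 2 for t in angles)
    cc = sum(math.cos(t) ** 2 for t in angles)
    return sc, ss, cc
```

For example, with n = 5 the sum in (7) vanishes for k = 1, 2, 3, 4 and equals $5(\cos 5\psi + i\sin 5\psi)$ for k = 5.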
1.2. $P_n(x_0, y_0; r; \phi)$, $n \geq 3$, shall denote the closed finite region bounded
by the regular n-gon $p_n(x_0, y_0; r; \phi)$, whose center is at $(x_0, y_0)$ and whose
inscribed circle has radius r; $\phi$ denotes the angle from R to N,
where R is the ray extending horizontally to the right from $(x_0, y_0)$ and
N is the exterior normal at the point where R emerges from the polygon.
$|P_n(x_0, y_0; r; \phi)|$ shall denote the area of $P_n(x_0, y_0; r; \phi)$ and $|p_n(x_0, y_0; r; \phi)|$
shall denote the length of $p_n(x_0, y_0; r; \phi)$.
We enumerate the sides of $p_n(x_0, y_0; r; \phi)$ in counter-clockwise fashion,
$s_0, s_1, \cdots, s_{n-1}$, where $s_0$ is the side to which N is normal.


(5) J. L. Walsh considered finite averages over the vertices of regular n-gons, obtaining
results similar to some of the results obtained in this paper, in his article A mean-value theorem
for polynomials and harmonic polynomials, Bull. Amer. Math. Soc. vol. 42 (1936) pp. 923-930.

(6) Walsh, loc. cit. p. 924.




1.3. LEMMA 1. If f(x, y) is superficially summable in the interior of a finite
domain D, if n is a fixed integer, $n \geq 3$, and if for each point $(x_0, y_0)$ in D the
equation


(8) $f(x_0, y_0) = \dfrac{1}{|P_n(x_0, y_0; r; 0)|}\iint_{P_n(x_0, y_0; r; 0)} f(x_0 + \xi, y_0 + \eta)\,d\xi\,d\eta$


holds for each $P_n(x_0, y_0; r; 0)$ in D, then f(x, y) is harmonic in D(7).


Proof. For any function F(x, y), defined and superficially summable in
the interior of D, the areal averaging function


$F(x, y; r) = \dfrac{1}{|P_n(x, y; r; 0)|}\iint_{P_n(x, y; r; 0)} F(x + \xi, y + \eta)\,d\xi\,d\eta$


is one degree smoother than F(x, y) in the open subset $D_r$ of D, where F(x, y; r)
is defined; that is, if F(x, y) is superficially summable in the interior of D,
then (for r fixed) F(x, y; r) is (at least) continuous in $D_r$, or if F(x, y) has
continuous partial derivatives of the mth order in D, then F(x, y; r) has
continuous partial derivatives of the (m+1)st order in $D_r$(8). Hence it follows by a
simple induction that if the summable function f(x, y) satisfies (8) at each
point $(x_0, y_0)$ in D, for each $P_n(x_0, y_0; r; 0)$ lying in D, then f(x, y) has
continuous partial derivatives of all orders in D.
Accordingly, we may use the finite Taylor expansion 


+ vo + 0) = yo) + 1] + o(f*), 


= + 9, 
for f(x, y), about the point (xo, yo); here 
4 
Ox oy 
is a differential operator, the partial derivatives are evaluated at (xo, yo), and 
o(r*) denotes a function (not always the same function) such that 


r0 


The side $s_m$ of $P_n(x_0, y_0; r; 0)$ can be represented by the polar equation
(9) $s_m:\ \rho = r\sec(\theta - 2\pi m/n), \quad (2m - 1)\pi/n \leq \theta \leq (2m + 1)\pi/n;$


(7) In the proof we use only the weaker assumption that the difference of the two members 
in (8) is (uniformly) o(r*) in D. 

(*) For a list of the principal properties of averaging functions, see H. E. Bray, Proof of a 
formula for an area, Bull. Amer. Math. Soc. vol. 29 (1923) pp. 264-270. 




that is
$s_m:\ \rho = r\sec\psi, \quad \psi = \theta - 2\pi m/n, \quad -\pi/n \leq \psi \leq \pi/n.$


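Hence we have


$\iint_{P_n(x_0, y_0; r; 0)} f(x_0 + \xi, y_0 + \eta)\,d\xi\,d\eta = |P_n(x_0, y_0; r; 0)|\,f(x_0, y_0)$

$\quad + \sum_{m=0}^{n-1}\int_{-\pi/n}^{\pi/n}\int_0^{r\sec\psi}\sum_{k=1}^{2}\dfrac{1}{k!}\left[\rho\cos\left(\psi + \dfrac{2\pi m}{n}\right)\dfrac{\partial}{\partial x} + \rho\sin\left(\psi + \dfrac{2\pi m}{n}\right)\dfrac{\partial}{\partial y}\right]^k f(x_0, y_0)\,\rho\,d\rho\,d\psi + o(r^4).$


Applying (4), (5) and (6), we obtain


(10) $\iint_{P_n(x_0, y_0; r; 0)} f(x_0 + \xi, y_0 + \eta)\,d\xi\,d\eta = |P_n(x_0, y_0; r; 0)|\,f(x_0, y_0) + \dfrac{nr^4}{8}\left(\dfrac{1}{3}\tan^3\dfrac{\pi}{n} + \tan\dfrac{\pi}{n}\right)\Delta f(x_0, y_0) + o(r^4),$

where

$\Delta = \dfrac{\partial^2}{\partial x^2} + \dfrac{\partial^2}{\partial y^2}$
is the Laplacian operator. The lemma now follows from (8) and (10), which
yield the equation $\Delta f(x, y) = 0$.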
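For a quadratic polynomial the remainder in (10) vanishes, so the expansion can be compared directly with a numerical integral over the polygon. The sketch below integrates side by side in the polar representation (9) with a midpoint rule. The function names, the grid size and the test polynomial $x^2 + xy$ (Laplacian 2) are our own choices for illustration.

```python
import math


def polygon_integral(f, x0, y0, r, n, steps=200):
    """Midpoint-rule integral of f over the regular n-gon P_n(x0, y0; r; 0)."""
    half = math.pi / n
    h_psi = 2.0 * half / steps
    h_t = 1.0 / steps
    total = 0.0
    for m in range(n):
        centre = 2.0 * math.pi * m / n       # direction of the normal to side s_m
        for i in range(steps):
            psi = -half + (i + 0.5) * h_psi
            rho_max = r / math.cos(psi)      # rho = r sec(psi) on the side
            c, s = math.cos(psi + centre), math.sin(psi + centre)
            for j in range(steps):
                rho = (j + 0.5) * h_t * rho_max
                total += f(x0 + rho * c, y0 + rho * s) * rho * rho_max * h_t * h_psi
    return total


def predicted_integral(f0, laplacian, r, n):
    """Right member of (10) without the remainder term."""
    t = math.tan(math.pi / n)
    area = n * r * r * t
    return area * f0 + n * r ** 4 / 8.0 * (t + t ** 3 / 3.0) * laplacian


def quadratic_example(x, y):
    return x * x + x * y
```

The cross term xy contributes nothing to the integral beyond its value at the centre, which reflects identity (5).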
1.4. If f(x, y) is harmonic in a finite domain D, then f(x, y) may be expanded
in a Fourier series about each point $(x_0, y_0)$ in D:


(11) $f(x, y) = f(x_0 + \rho\cos\theta, y_0 + \rho\sin\theta) = f(x_0, y_0) + \sum_{k=1}^{\infty}\rho^k(a_k\cos k\theta + b_k\sin k\theta).$


LEMMA 2. If f(x, y) is harmonic in a finite domain D, then for each $P_n(x_0, y_0; r; 0)$
such that $D(x_0, y_0; r\sec\pi/n)$ is in D we have


(12) $\iint_{P_n(x_0, y_0; r; 0)} f(x_0 + \xi, y_0 + \eta)\,d\xi\,d\eta = |P_n(x_0, y_0; r; 0)|\,f(x_0, y_0) + n\sum_{k=1}^{\infty}\dfrac{a_{kn}r^{kn+2}}{kn + 2}\int_{-\pi/n}^{\pi/n}\sec^{kn+2}\psi\cos kn\psi\,d\psi.$

Proof. Using (7), (9) and (11), we obtain


$\iint_{P_n(x_0, y_0; r; 0)} f(x_0 + \xi, y_0 + \eta)\,d\xi\,d\eta = |P_n(x_0, y_0; r; 0)|\,f(x_0, y_0) + n\int_{-\pi/n}^{\pi/n}\int_0^{r\sec\psi}\sum_{k=1}^{\infty}\rho^{kn+1}(a_{kn}\cos kn\psi + b_{kn}\sin kn\psi)\,d\rho\,d\psi.$


Now (12) follows from the fact that $\sin\theta$ is an odd function, while $\cos\theta$ and
$\sec\theta$ are even functions, of $\theta$.


2. AREAL MEAN-VALUES 


2.1. The real and imaginary parts of $(x + iy)^m$ are basic homogeneous harmonic
polynomials of degree m in the variables x, y. We shall denote these
polynomials by $H_{1,m}(x, y)$ and $H_{2,m}(x, y)$, respectively. Any homogeneous
harmonic polynomial of degree m in x, y is of the form


$aH_{1,m}(x, y) + bH_{2,m}(x, y),$


where a and b are constants.


2.2. THEOREM 2. If f(x, y) is superficially summable in the interior of a
finite domain D, and if n is a fixed integer, $n \geq 3$, then a necessary and sufficient
condition that for each point $(x_0, y_0)$ in D, the equation

(13) $f(x_0, y_0) = \dfrac{1}{|P_n(x_0, y_0; r; 0)|}\iint_{P_n(x_0, y_0; r; 0)} f(x_0 + \xi, y_0 + \eta)\,d\xi\,d\eta$

hold for each $P_n(x_0, y_0; r; 0)$ in D is that f(x, y) be a harmonic polynomial of
degree at most n, of the form


(14) $f(x, y) = B_0 + \sum_{k=1}^{n-1}[A_kH_{1,k}(x, y) + B_kH_{2,k}(x, y)] + B_nH_{2,n}(x, y),$


where $A_k$ and $B_k$ are constants, k = 0, 1, ..., n.


NECESSITY. If (13) holds, then, by Lemma 1, f(x, y) is harmonic; consequently,
by (13) and Lemma 2, we have

(15) $\sum_{k=1}^{\infty}\dfrac{a_{kn}r^{kn+2}}{kn + 2}\int_{-\pi/n}^{\pi/n}\sec^{kn+2}\psi\cos kn\psi\,d\psi = 0.$


Since (15) holds for all sufficiently small r, the coefficient of each power of r
must vanish; in particular we have


(16) $\dfrac{a_n}{n + 2}\int_{-\pi/n}^{\pi/n}\sec^{n+2}\psi\cos n\psi\,d\psi = 0.$


Now


$\int_{-\pi/n}^{\pi/n}\sec^{n+2}\psi\cos n\psi\,d\psi < 0,$


as one readily sees by an inspection of the graphs of the functions $\sec\psi$ and
$\cos n\psi$. Therefore (16) yields


$a_n = \dfrac{1}{n!}\dfrac{\partial^n f(x_0, y_0)}{\partial x^n} = 0.$


But (13) holds at each point $(x_0, y_0)$ in D, so that
(17) $\dfrac{\partial^n f(x, y)}{\partial x^n} = 0$
holds throughout D. By a simple induction we obtain, from (17) and the
Laplace equation $\Delta f(x, y) = 0$,


x, 
Now (17) and (18) imply (14). 
SUFFICIENCY. If f(x, y) is of the form (14), then f(x, y) can be continued 
harmonically so as to be defined and harmonic in the entire x, y-plane. Now 


(13) follows from (12), (17) and (18). 
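Theorem 2 can be observed in the simplest case n = 4, where $P_4(x_0, y_0; r; 0)$ is the axis-parallel square of half-side r. The sketch below averages polynomials over that square with a 3-point Gauss-Legendre rule in each variable, which is exact for the degrees used here. The function names and sample values are our own; the constant $-4r^4/15$ in the comment is obtained by integrating $\xi^4 - 6\xi^2\eta^2 + \eta^4$ over the square.

```python
import math

_NODES = (-math.sqrt(0.6), 0.0, math.sqrt(0.6))
_WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def square_average(f, x0, y0, r):
    """Average of f over the square |x - x0| <= r, |y - y0| <= r.

    Uses a tensor 3-point Gauss-Legendre rule, exact for polynomials of
    degree at most 5 in each variable.
    """
    total = 0.0
    for tx, wx in zip(_NODES, _WEIGHTS):
        for ty, wy in zip(_NODES, _WEIGHTS):
            total += wx * wy * f(x0 + r * tx, y0 + r * ty)
    return total / 4.0


def re_z4(x, y):
    """H_{1,4}: real part of (x + iy)^4."""
    return x ** 4 - 6.0 * x * x * y * y + y ** 4


def im_z4(x, y):
    """H_{2,4}: imaginary part of (x + iy)^4."""
    return 4.0 * x ** 3 * y - 4.0 * x * y ** 3


def re_z3(x, y):
    """H_{1,3}: real part of (x + iy)^3."""
    return x ** 3 - 3.0 * x * y * y
```

In agreement with the form (14), the averages of `im_z4` and `re_z3` reproduce the central value, whereas the average of `re_z4` falls short of the central value by $4r^4/15$, which is consistent with the negative sign of the integral used in the proof of necessity.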

2.3. THEOREM 3. If f(x, y) is superficially summable in the interior of a
finite domain D, if n is a fixed integer, $n \geq 3$, and if $\phi_0$ is fixed,
then a necessary and sufficient condition that for each point $(x_0, y_0)$ in D, the
equation


(19) $f(x_0, y_0) = \dfrac{1}{|P_n(x_0, y_0; r; \phi_0)|}\iint_{P_n(x_0, y_0; r; \phi_0)} f(x_0 + \xi, y_0 + \eta)\,d\xi\,d\eta$


hold for each $P_n(x_0, y_0; r; \phi_0)$ in D is that f(x, y) be a harmonic polynomial of
degree at most n, of the form


(20) $f(x, y) = B_0 + \sum_{k=1}^{n-1}[A_kH_{1,k}(x, y) + B_kH_{2,k}(x, y)] + B_n[H_{1,n}(x, y)\sin n\phi_0 + H_{2,n}(x, y)\cos n\phi_0],$


where $A_k$ and $B_k$ are constants, k = 0, 1, ..., n.
Proof. If we make the transformation of coordinates $x' + iy' = (x + iy)e^{i\phi_0}$,
then Theorem 3 follows at once from Theorem 2.


2.4. THEOREM 4. If f(x, y) is superficially summable in the interior of a 
finite domain D, and if n is a fixed integer, n=3, then a necessary and sufficient 
condition that for each point (xo, yo) in D, the equation 


1 
0, Yo) = 




degree at most n—1(°). 


Proof. The theorem follows from Theorem 3 and the fact that while the
class of harmonic polynomials of the form (20) is not invariant under rotations
of the plane, the class of harmonic polynomials of degree less than n is
invariant under these rotations.


3. PERIPHERAL MEAN-VALUES 


3.1. THEOREM 5. If f(x, y) is summable on each $p_n(x, y; r; 0)$ and on each
$P_n(x, y; r; 0)$ lying in a finite simply-connected domain D, where n is a fixed
integer, $n \geq 3$, then a necessary and sufficient condition that for each point
$(x_0, y_0)$ in D the equation


(21) $f(x_0, y_0) = \dfrac{1}{|p_n(x_0, y_0; r; 0)|}\int_{p_n(x_0, y_0; r; 0)} f(x_0 + \xi, y_0 + \eta)\,ds$


hold for each $p_n(x_0, y_0; r; 0)$ in D is that f(x, y) be a harmonic polynomial of the
form (14)(10).


NECESSITY. If we multiply both members of (21) by $|p_n(x_0, y_0; r; 0)|$ and
integrate with respect to r, and then apply the theorem of Fubini to the
superficially summable function f(x, y), we obtain (13) for each point $(x_0, y_0)$ in
D, for each $P_n(x_0, y_0; r; 0)$ lying in D. Hence, by Theorem 2, f(x, y) is of the
form (14).

SUFFICIENCY. If f(x, y) is given by (14), then, by Theorem 2, (13) holds.
Differentiating both sides of (13) with respect to r, we obtain


$\dfrac{1}{|p_n(x_0, y_0; r; 0)|}\int_{p_n(x_0, y_0; r; 0)} f(x_0 + \xi, y_0 + \eta)\,ds = \dfrac{1}{|P_n(x_0, y_0; r; 0)|}\iint_{P_n(x_0, y_0; r; 0)} f(x_0 + \xi, y_0 + \eta)\,d\xi\,d\eta,$


which, with (13), implies (21).
3.2. The following two theorems are analogous to Theorems 3 and 4, 
respectively. 


THEOREM 6. If f(x, y) is summable on each $p_n(x, y; r; \phi_0)$ and on each
$P_n(x, y; r; \phi_0)$ lying in a finite simply-connected domain D, where n is a fixed
integer, $n \geq 3$, and where $\phi_0$ is fixed, then a necessary and sufficient
condition that for each point $(x_0, y_0)$ in D, the equation


(23) $f(x_0, y_0) = \dfrac{1}{|p_n(x_0, y_0; r; \phi_0)|}\int_{p_n(x_0, y_0; r; \phi_0)} f(x_0 + \xi, y_0 + \eta)\,ds$


(9) Cf. Walsh, loc. cit., p. 923, Theorem 3.
(10) Theorems 5-7 actually hold for n = 2. We have stated them for $n \geq 3$ to conform with
the analogous Theorems 2-4, respectively.


I (xo + é, Yo + n)ds 


(22) 


hold for each P,,(xo, yo; r; &) in D, ts that f(x, y) be a harmonic polynomial of 




hold for each $p_n(x_0, y_0; r; \phi_0)$ in D is that f(x, y) be a harmonic polynomial of the
form (20).

THEOREM 7. If f(x, y) is summable on each $p_n(x, y; r; \phi)$ and on each
$P_n(x, y; r; \phi)$ lying in a finite simply-connected domain D, where n is a fixed
integer, $n \geq 3$, then a necessary and sufficient condition that for each point
$(x_0, y_0)$ in D, the equation


$f(x_0, y_0) = \dfrac{1}{|p_n(x_0, y_0; r; \phi)|}\int_{p_n(x_0, y_0; r; \phi)} f(x_0 + \xi, y_0 + \eta)\,ds$


hold for each $p_n(x_0, y_0; r; \phi)$ in D is that f(x, y) be a harmonic polynomial of
degree at most n-1.

Theorems 6 and 7 follow from Theorems 3 and 4, respectively, in the 
same way that Theorem 5 follows from Theorem 2. 

3.3. Theorems 5, 6 and 7 are analogous to Theorems 2, 3 and 4, respec- 
tively. Similarly, we might give three theorems of the type of Theorem 1 which 
are analogous to Theorems 2, 3 and 4. We give the explicit statement of only 
the last of these: 

THEOREM 8. If f(x, y) is continuous in a finite simply-connected domain D,
then a necessary and sufficient condition that for each point $(x_0, y_0)$ in D, the
equation


$\dfrac{1}{|p_n(x_0, y_0; r; \phi)|}\int_{p_n(x_0, y_0; r; \phi)} f(x_0 + \xi, y_0 + \eta)\,ds = \dfrac{1}{|P_n(x_0, y_0; r; \phi)|}\iint_{P_n(x_0, y_0; r; \phi)} f(x_0 + \xi, y_0 + \eta)\,d\xi\,d\eta$


hold for each $P_n(x_0, y_0; r; \phi)$ lying in D, is that f(x, y) be a harmonic polynomial
of degree at most n-1.


THE UNIVERSITY OF MICHIGAN,
ANN ARBOR, MICH.

THE OHIO STATE UNIVERSITY,
COLUMBUS, OHIO




THE CONVERSE OF THE FATOU THEOREM FOR 
POSITIVE HARMONIC FUNCTIONS 


BY 
LYNN H. LOOMIS 


1. Introduction. Let $v(z) = v(re^{i\phi})$ be a function harmonic in the unit circle
|z| < 1 and admitting there the Poisson-Stieltjes representation


(1) $v(re^{i\phi}) = \dfrac{1}{2\pi}\int_0^{2\pi}\dfrac{1 - r^2}{1 + r^2 - 2r\cos(\theta - \phi)}\,dV(\theta),$
where $V(\theta)$ is of bounded variation over $0 \leq \theta \leq 2\pi$. The Fatou theorem(1), in
one form, has the following to say about the relation between v(z) and $V(\theta)$ in
(1):

THEOREM A. If $V_{(1)}(\theta) = \lim_{\epsilon \to 0}(V(\theta + \epsilon) - V(\theta - \epsilon))/2\epsilon$ exists, then
$v(re^{i\theta}) \to V_{(1)}(\theta)$ as $r \to 1$.


THEOREM B. If the derivative $V'(\theta)$ exists, then $v(z) \to V'(\theta)$ as $z \to e^{i\theta}$ along
any chord of |z| < 1 (hence along any “non-tangential path” or “in angle”).


The converses of these theorems are in general not true. If v(z) is positive
however, both converses can be proved. One result is that if $v(re^{i\theta})$ is a
bounded function harmonic in |z| < 1, and if its boundary function $v(\theta)$ is
defined as the limit, wherever it exists, of v(z) as $z \to e^{i\theta}$ “in angle,” then $v(\theta)$
is a summable function which is precisely equal to the derivative of its
indefinite integral. The converse of Theorem A for positive functions follows
readily from known theorems, and it is the main object of this paper to deduce
from it a strengthened form of the converse of Theorem B for positive
functions.

We shall have occasion to use the theorem(?) that a harmonic function 
has the representation (1) if and only if it can be written as the difference of 
two non-negative (or two positive) harmonic functions. In particular, every 
positive harmonic function has the representation (1) with V(@) increasing. 

2. The converse of Theorem A for positive functions. It will be simplest 
to infer the converse of Theorem A for positive functions from a series of re- 
marks. 

(i) The limit (if it exists) $V_{(1)}(\theta) = \lim_{\epsilon \to 0}[V(\theta + \epsilon) - V(\theta - \epsilon)]/2\epsilon$ is known
as the generalized symmetric derivative of $V(\theta)$.


Presented to the Society, May 3, 1941 under the title A converse to the Fatou theorem;
received by the editors May 12, 1942.

(1) Fatou's original paper is in the Acta Math. vol. 30 (1906) pp. 335-400.

(2) See Evans, Logarithmic potential. Discontinuous Dirichlet and Neumann problems, Amer.
Math. Soc. Colloquium Publications vol. 6, 1927, p. 48.




(ii) If $\sum(a_n\cos n\theta + b_n\sin n\theta)$ is the Fourier-Stieltjes series for $dV(\theta)$, then
$v(re^{i\theta}) = \sum r^n(a_n\cos n\theta + b_n\sin n\theta)$ is the Fourier series expansion for v(z), so
that the existence of the limit $\lim_{r \to 1} v(re^{i\theta})$ is equivalent (by definition) to
the Abel summability (summability A) of the series $\sum(a_n\cos n\theta + b_n\sin n\theta)$.

(iii) It is a well known theorem(3) that a series is summable (C, n+1) if it
is summable A and if its nth Cesàro means are positive.

(iv) It is elementary that the (C, 1) means of the Fourier-Stieltjes series
of a non-decreasing function are positive(4).

(v) By a theorem of Hardy and Littlewood(5), summability $(C, \alpha)$ with
$\alpha > 0$ for the Fourier-Stieltjes series of a non-decreasing function implies
summability $(C, \beta)$ for every $\beta > 0$, and is equivalent to the existence of the
generalized symmetric derivative $V_{(1)}(\theta)$.

The converse of Theorem A for positive functions follows directly from 
these remarks. 

3. The Poisson-Stieltjes integral in the half-plane. For some purposes it
is convenient to work with integral representations in the half-plane rather
than in the circle. A function u(x, y) harmonic in the half-plane y > 0 admits
the Poisson-Stieltjes integral representation

(2) $u(x, y) = \dfrac{1}{\pi}\int_{-\infty}^{\infty}\dfrac{y(1 + t^2)}{y^2 + (t - x)^2}\,dU(t),$

where U(t) is of bounded variation over the closed infinite interval $[-\infty, \infty]$,
if and only if the transformed function v(w) obtained by mapping the
half-plane (by $z = i(1 - w)/(1 + w)$) onto the unit circle |w| < 1 has the
Poisson-Stieltjes representation (1), where $U(\tan\theta/2) = V(\theta)/2$. Note that (2) is
not actually an improper integral, for the integrand is continuous over
the closed infinite interval $[-\infty, \infty]$ and U(t) is of bounded variation
there; also note that U(t) may have a jump at infinity. Obviously
$U'(t) = V'(2\arctan t)/(1 + t^2)$ when either derivative exists; thus $U'(0) = V'(0)$.
We can rewrite (2) by removing the jump of U(t) at infinity as ky and
writing $U_1(t) = \int_0^t (1 + s^2)\,dU(s)$. Then (2) becomes


(3) $u(x, y) = ky + \dfrac{1}{\pi}\int_{-\infty}^{\infty}\dfrac{y}{y^2 + (t - x)^2}\,dU_1(t).$
The integral is absolutely convergent and the kernel is simpler than the
kernel of (2). Also U{(t) = V’(2 arc tan #) so that the Fatou theorem is gen- 
erally valid. On the other hand, U;(t) is not of bounded variation in the in- 


dU 


(3) See Kogbetliantz, Sommation des séries et intégrales divergentes par les moyennes
arithmétiques et typiques, Mémorial des Sciences Mathématiques vol. 51 p. 40, Theorem 21.

(4) See Titchmarsh, Theory of functions, p. 412. It is only necessary to replace the Lebesgue
integral by a Stieltjes integral in the equation for $\sigma_n$.

(5) See Zygmund, Trigonometrical series, p. 263 and p. 266, Example 11.




finite interval. We shall find it convenient to use (2) rather than (3), and ad- 
just the mapping of | w| <1 onto y>0 so that any desired boundary point 
maps to the origin z=0 where the desired Fatou relation holds. 

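If $U_1(t)$ is absolutely continuous with derivative u(t) then (3) becomes


(4) $u(x, y) = ky + \dfrac{1}{\pi}\int_{-\infty}^{\infty}\dfrac{y}{y^2 + (t - x)^2}\,u(t)\,dt.$

If the original function v(w) admits the ordinary Poisson representation

(5) $v(re^{i\phi}) = \dfrac{1}{2\pi}\int_0^{2\pi}\dfrac{1 - r^2}{1 + r^2 - 2r\cos(\theta - \phi)}\,v(\theta)\,d\theta,$


the transformed function has the representation (4) with k = 0 and
$u(t) = v(2\arctan t)$.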
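The representation (4) with k = 0 can be evaluated numerically by the substitution $t = x + y\tan\beta$, which turns the kernel into the constant weight $d\beta/\pi$ on $(-\pi/2, \pi/2)$. The sketch below uses a midpoint rule in $\beta$. The boundary function $1/(1 + t^2)$ is a hypothetical example, whose harmonic extension $(y + 1)/(x^2 + (y + 1)^2)$ follows from the semigroup property of the Poisson kernel; the function names are our own.

```python
import math


def poisson_half_plane(boundary, x, y, steps=2000):
    """Poisson integral (1/pi) * int y u(t) dt / (y^2 + (t - x)^2), y > 0.

    With t = x + y*tan(beta) the integral becomes
    (1/pi) * int_{-pi/2}^{pi/2} u(x + y*tan(beta)) d(beta).
    """
    h = math.pi / steps
    total = 0.0
    for j in range(steps):
        beta = -math.pi / 2.0 + (j + 0.5) * h
        total += boundary(x + y * math.tan(beta))
    return total * h / math.pi


def boundary_example(t):
    """Hypothetical boundary function u(t) = 1 / (1 + t^2)."""
    return 1.0 / (1.0 + t * t)


def exact_extension(x, y):
    """Harmonic extension of boundary_example to the half-plane y > 0."""
    return (y + 1.0) / (x * x + (y + 1.0) ** 2)
```

As y decreases to 0 along a ray the computed values approach the boundary value, which is the behavior asserted by the Fatou theorem for a function with a continuous boundary density.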

For the purposes of this paper, the factor $1 + t^2$ in the numerator of the
integrand of (2) may be dropped. We are interested in limiting behavior as z
approaches the origin along rays (rx, ry), $0 < r \leq 1$, y > 0. But


We consider an angle space by restricting x to $-x_0 < x < x_0$. The integrand
of the right member of (6) is bounded over $-x_0 < x < x_0$, $-\infty \leq t \leq \infty$, so
that the absolute value of the integral is bounded by rMV (where V is the
variation of U(t)). Thus this term approaches 0 uniformly as z approaches the
origin in any angle space, and we can disregard it. We have left to consider
the harmonic function, again denoted u(x, y),


which can be written 


(8) u(rx, ry) = = 2 — 
The Fatou Theorems A and B follow at once from (8). The assumption
of Theorem A is that $[U(t) - U(-t)]/2t \to U_{(1)}(0)$ as $t \to 0$. Thus
$[U(rt) - U(-rt)]/r = 2tU_{(1)}(0) + tR(rt)$ where |R(t)| is bounded, say by M, and
$|R(t)| \to 0$ as $t \to 0$. For x = 0, (8) becomes


r 


and if we substitute the above expression in (9) and integrate the last term 
by parts, we have 




2 
= Ua(0) +— f R(rf) sin® Bap, 
0 


where $\beta = \arctan(t/y)$. For r small enough, |R(rt)| is arbitrarily small over as
large a part of the range of t, and hence of $\beta$, as desired. Over the remaining
part of the range of $\beta$ the integrand is bounded by M. Therefore the integral
approaches 0, and $U_{(1)}(0) = \lim_{r \to 0} u(0, ry)$, which is the conclusion of Theorem
A.
Theorem B can as easily be inferred. We assume U(0) = 0, and have
$U(rt)/r = tU'(0) + tR(rt)$, and instead of (10) this gives
ut — x)y 

Now | (t—x)?)| <K over —x9<x<xo9, — © StS @, and the ab- 
solute value of the integral is therefore bounded by 

2K 

— | R(rt) | dB, 


2 
u(rx, ry) = U'(0) + —f R(rt) 


which approaches 0 with r as in the proof of Theorem A. 
If one will compare these proofs with the corresponding proofs carried out 
in the unit circle(*), the advantages of the half-plane representations will be 


appreciated. 

Using the representation (7) the converse of Theorem A for positive func- 
tions can be deduced immediately from the following integral Tauberian 
theorem of Hardy and Littlewood(’). 


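THEOREM. Let f(t) be positive, and suppose that f(t)/(t + x)^ρ ∈ L(0, ∞) for
some (and so for all) x > 0. Suppose that

$$\int_0^{\infty}\frac{f(t)}{(t + x)^{\rho}}\,dt \sim \frac{H}{x^{\sigma}}$$

as x → ∞ (as x → 0), for 0 < σ < ρ. Then

$$\int_0^{t} f(u)\,du \sim \frac{H\,\Gamma(\rho)}{\Gamma(\sigma)\,\Gamma(\rho - \sigma + 1)}\,t^{\rho - \sigma}$$

as t → ∞ (as t → 0).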
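
As an illustrative check of the constant in this theorem (a sketch, not part of the original argument), one may take the test function f(t) = t with ρ = 4. The transform is then x^{−2}/6 exactly, so σ = 2 and H = 1/6, and the theorem predicts that the integral of f over (0, t) behaves like t²/2. The helper names `stieltjes_transform` and `hl_coefficient`, the test function, and the sample point x = 10 are all assumptions introduced for this example.

```python
import math

def stieltjes_transform(f, x, rho, steps=20000):
    """Midpoint-rule estimate of the integral of f(t)/(t+x)**rho over (0, inf).

    The substitution t = x*s/(1-s) maps s in (0, 1) onto t in (0, inf).
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        t = x * s / (1.0 - s)
        dt_ds = x / (1.0 - s) ** 2
        total += f(t) / (t + x) ** rho * dt_ds
    return total * h

def hl_coefficient(H, rho, sigma):
    """Constant multiplying t**(rho - sigma) in the conclusion of the theorem."""
    return H * math.gamma(rho) / (math.gamma(sigma) * math.gamma(rho - sigma + 1.0))

# Hypothetical test case: f(t) = t, rho = 4, hence sigma = 2.
rho, sigma, x = 4.0, 2.0, 10.0
H_est = stieltjes_transform(lambda t: t, x, rho) * x ** sigma
coeff = hl_coefficient(H_est, rho, sigma)
```

For this exact power law the asymptotic relation is an identity, so the estimate of H does not depend on the sample point x; for a general f one would have to let x tend to ∞ or 0.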


The statement of the theorem can be modified to include Stieltjes integra-
tion, and then only elementary changes of variable are required to put the
theorem in a form directly applicable to the Poisson integral for the half-plane.
We shall have occasion to use the following theorem(8).

(6) See Evans, loc. cit., pp. 39-43.
(7) Hardy and Littlewood, On Tauberian theorems, Proc. London Math. Soc. (2) vol. 30
(1930) p. 25.

1943] CONVERSE OF THE FATOU THEOREM 243


THEOREM. Let U(t) have a jump m at t = 0; thus m = 0 is equivalent to the
continuity of U(t) at t = 0. Then as z = x + iy approaches the origin along the ray
x = ky, yu(x, y) approaches the value m/(1 + k²)π. In particular, U(t) is con-
tinuous at t = 0 if and only if yu(x, y) approaches 0 along some ray (and hence
along all rays).


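In proof we consider

$$(ry)\,u(rx, ry) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{y^2}{y^2 + (t - x)^2}\,dU(rt).$$

Now as r → 0, U(rt) → V(t) where V(t) = U(0+) for 0 < t < ∞, V(t) = U(0−)
for −∞ < t < 0, and V(±∞) = U(±∞). Then by the Helly-Bray theorem con-
cerning the convergence of sequences of Stieltjes integrals(9),

$$\lim_{r \to 0}\,(ry)\,u(rx, ry) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{y^2}{y^2 + (t - x)^2}\,dV(t) = \frac{1}{\pi}\,\frac{my^2}{y^2 + x^2}.$$

Since x = ky the theorem follows.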
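
The jump theorem can be checked on a simple example. The sketch below assumes the hypothetical choice U(t) = m·step(t) + arc tan t, for which the Poisson-Stieltjes integral is available in closed form: the jump contributes m·y/(π(x² + y²)), and the smooth part, by the semigroup property of the Poisson kernel, contributes (y + 1)/((y + 1)² + x²). The function names are illustrative only.

```python
import math

def u(x, y, m):
    """Closed form of (1/pi) * integral of y/(y^2 + (t-x)^2) dU(t)
    for the assumed example U(t) = m*step(t) + arctan(t)."""
    jump_part = m * y / (math.pi * (x * x + y * y))
    smooth_part = (y + 1.0) / ((y + 1.0) ** 2 + x * x)
    return jump_part + smooth_part

def scaled_ray_value(k, m, y):
    """y * u(x, y) evaluated on the ray x = k*y."""
    return y * u(k * y, y, m)

# Along the ray x = 2y with jump m = 3 the theorem predicts the limit m / ((1 + k^2) * pi).
value = scaled_ray_value(2.0, 3.0, 1e-8)
predicted = 3.0 / ((1.0 + 2.0 ** 2) * math.pi)
```

With m = 0 the same function gives y·u(x, y) → 0 on every ray, in agreement with the continuity criterion in the theorem.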
4. The converse of Theorem B for positive functions. We shall prove the
stronger theorem:

THEOREM 1. Let u(z) = u(x, y) be a positive harmonic function in the upper
half-plane y > 0, and having therefore the representation (2) with U(t) increasing.
If lim u(z) = u(0) as z approaches the origin along each of two rays, then U′(0)
exists and equals u(0).

We shall carry through the proof in a number of steps, using (7) instead of 
(2). 

(i) If u(z) → u(0) as z → 0 along each of two rays then u(z) → u(0) uni-
formly as z → 0 between the rays. If the angle space between the rays is opened
up to a half-plane by a power transformation (a rotation followed by a power
of z) we obtain from u(z) a new positive
harmonic function u₁(x, y) continuous in the closed half-plane y ≥ 0 except
possibly at the origin, and having a boundary function u₁(t) which is con-
tinuous at the origin if it is defined there to have the value u(0). Since
u₁(x, y) is positive it admits the representation (7) where U₁(t) has the con-
tinuous derivative u₁(t) when t ≠ 0. Thus U₁(t) is absolutely continuous if


(8) Fejér, Über die Bestimmung des Sprunges der Funktion aus ihrer Fourierreihe, J. reine
angew. Math. vol. 142 (1913) pp. 165-166. See also Warschawski, Bemerkung zu meiner Arbeit:
Über das Randverhalten der Ableitung der Abbildungsfunktion bei konformer Abbildung, Math.
Zeit. vol. 38 (1934) p. 682.

(9) See D. V. Widder, The Laplace transform, p. 31.




it has no jump discontinuity at the origin. But by the theorem at the end of
the last section, U(t) is continuous at t = 0, which implies successively that
ru(rx, ry) → 0 as r → 0 for every point (x, y) in the upper half-plane y > 0, that
similarly ru₁(rx, ry) → 0, and finally that U₁(t) is continuous at the origin.
Therefore U₁(t) is absolutely continuous and u₁(x, y) has the representation

$$u_1(x, y) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{y}{y^2 + (t - x)^2}\,u_1(t)\,dt.$$

The continuity of u₁(t) at the origin implies the continuity of u₁(x, y) at the
origin, which proves the assertion (i).

(ii) Next, u(z) → u(0) as z → 0 along any ray to the origin. Let l₁ and l₂ be
the two given rays and suppose that l₂ makes the positive angle α with l₁. Let
l₃ make the angle α with l₂ (we suppose that l₃ lies in the upper half-plane).
We shall show that if l: (rx, ry) is any ray between l₂ and l₃, then u(rx, ry)
→ u(0) as r → 0. We need only to open up the angle space between l₁ and l₃
as in (i); l₂ becomes the ray perpendicular to the axis at the origin. The new
positive harmonic function u₁(x, y) has the representation (2) and by hy-
pothesis (a) u₁(0, ry) → u(0) as r → 0, (b) U₁(t) is absolutely continuous for
t > 0, and u₁(t) = U₁′(t) → u(0) as t → 0 (from the right). Thus U₁(t)/t → u(0) as
t → 0 from the right. Application of the converse of Theorem A for positive
functions shows that U₁(t)/t → u(0) as t → 0 from the left. Thus U₁′(0) exists
and equals u(0). By the Fatou theorem u₁(rx, ry) → u(0) as r → 0 for every
(x, y) with y > 0. This proves the statement about rays between l₂ and l₃.
The assertion (i), together with a finite number of applications of the above
process, proves (ii).

(iii) It remains to prove from these facts that U′(0) exists and equals
u(0). By the converse of Theorem A it is sufficient to prove that U(t)/t → u(0)
as t → 0 from the right. The obvious device is to open up an angle space hav-
ing the positive real axis as one of its bounding rays. But it is then somewhat
difficult to establish the relation between the functions U₁(t) and U(t) for
positive t. We shall proceed differently. Integrating (7) by parts we have

(11) $$u(x, y) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{2y(t - x)}{[y^2 + (t - x)^2]^2}\,U(t)\,dt.$$

By (ii)

(12) $$u(0, ry) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{2yt}{[y^2 + t^2]^2}\,\frac{U(rt)}{r}\,dt \to u(0) \quad \text{as } r \to 0.$$


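We can assume that U(0) = 0, and the integrand in (12) is accordingly non-
negative. We can therefore integrate and invert the order of integration,
giving

$$\int_0^1 u(0, ry)\,dr = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{2yt}{[y^2 + t^2]^2}\left(\int_0^1 \frac{U(rt)}{r}\,dr\right)dt.$$

In particular, U(r)/r is integrable over every finite interval. We can now per-
form the same operation on (11), justifying the change in the order of integra-
tion by absolute integrability. Thus

$$u_1(x, y) = \int_0^1 u(rx, ry)\,dr = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{2y(t - x)}{[y^2 + (t - x)^2]^2}\left(\int_0^1 \frac{U(rt)}{r}\,dr\right)dt,$$

and integrating by parts,

$$u_1(x, y) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{y}{y^2 + (t - x)^2}\,\frac{U(t)}{t}\,dt.$$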
The function u₁(z) = u₁(x, y) is obviously harmonic and positive in the upper
half-plane, and u₁(z) → u(0) as z → 0 along any ray (rx, ry), y > 0. We now em-
ploy the device suggested at the beginning of (iii). Let l₁ and l₂ make angles
α (α < π/2) and 2α with the positive real axis, and apply the transformation
w = z^{π/2α}. The harmonic function u₁(z) has the boundary function u₁(t) = U(t)/t.
After the transformation, the new harmonic function u₂(w) has the boundary
function u₂(t) = u₁(t^{2α/π}). In the Stieltjes form U₂′(t) = u₂(t) and by definition
U₂(t)/t → u(0) as t → 0 from the left. By the converse of Theorem A for positive
functions

$$\frac{1}{t}\int_0^t u_2(s)\,ds \to u(0)$$


as t → 0 from the right. The following lemma is due to Landau(10).

If xf′(x) increases with x and f(x) ~ x^α (α > 0) as x → 0, then f′(x) ~ αx^{α−1}
as x → 0.

Here

$$f(t) = \int_0^t u_2(s)\,ds \sim t\,u(0)$$

as t → 0, and tf′(t) = tu₂(t), which increases with t. Therefore by
Landau's lemma, u₂(t) → u(0) as t → 0, that is, U(t)/t → u(0) as t → 0 from the
right. We now apply the converse of Theorem A again to obtain U(t)/t → u(0)
as t → 0 from the left. Thus U′(0) exists and equals u(0), and the proof of the
theorem is complete.


(10) E. Landau, Beiträge zur analytischen Zahlentheorie, Rend. Circ. Mat. Palermo vol. 24
(1917) pp. 81-160.




It should be remarked that the direct converse of Theorem B for positive
functions can be proved from considerations of the integral representation
(7) without any reference to the converse of Theorem A.

5. A counterexample. In this section we shall show by a counterexample
that neither of the converses of A and B is true for the general representa-
tion (2). We first define the function U(t) and then define the harmonic func-
tion u(z) by the representation (7). The graph of U(t) will consist of a se-
quence of triangular peaks separated by intervals of the t-axis and converging
to the origin, the vertices of the peaks lying on the line s = t over the points
t = 2^{−n}, the slopes of the sides of the peaks to be determined by later con-
siderations. Such a function U(t) is clearly of bounded variation.

We thus define U(t) as follows:

U(t) = 0, t ≤ 0,
U(t) = 0, t ≥ 1,
U(t) = (1/2)^n, t = (1/2)^n, n = 1, 2, ···,
U(t) = 0, t = (1/2)^n ± a_n, 0 ≤ a_n ≤ 2^{−n−2},

U(t) linear on 2^{−n} − a_n ≤ t ≤ 2^{−n} and on 2^{−n} ≤ t ≤ 2^{−n} + a_n, and U(t) = 0 else-
where in 0 < t < 1. The a_n are positive numbers to be chosen later subject to
the restriction noted above. On 2^{−n} − a_n < t < 2^{−n}, dU(t) = 2^{−n}dt/a_n so that


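$$u(x, y) = \frac{1}{\pi}\sum_{n=1}^{\infty}\frac{1}{2^n a_n}\left[\int_{2^{-n} - a_n}^{2^{-n}}\frac{y\,dt}{y^2 + (t - x)^2} - \int_{2^{-n}}^{2^{-n} + a_n}\frac{y\,dt}{y^2 + (t - x)^2}\right]$$

$$= \frac{1}{\pi}\sum_{n=1}^{\infty}\int_{2^{-n}}^{2^{-n} + a_n}\frac{1}{2^n a_n}\left[\frac{y}{y^2 + (t - x - a_n)^2} - \frac{y}{y^2 + (t - x)^2}\right]dt$$

$$= \frac{1}{\pi}\sum_{n=1}^{\infty}\int_{2^{-n}}^{2^{-n} + a_n}\frac{y}{2^n}\,\frac{2(t - x) - a_n}{[y^2 + (t - x - a_n)^2][y^2 + (t - x)^2]}\,dt.$$

Thus

$$|u(x, y)| \le \frac{1}{\pi}\sum_{n=1}^{\infty}\int_{2^{-n}}^{2^{-n} + a_n}\frac{y}{2^n}\,\frac{|2(t - x) - a_n|}{[y^2 + (t - x - a_n)^2][y^2 + (t - x)^2]}\,dt.$$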


Consider the term 
y 2(¢ — x) — a 
12 [y?+  — + 2)*] 


ay 




on a ray x=ky. If we allow x, y, ¢ and a; to vary, subject to the restrictions 
0S4,51/8, x=ky ,the term has a maximum value 
M;,. By homogeneity the general term 


y 2(¢ — x) — a, 

2" + (¢ — — + (¢ — 
with 0 Sa, 2-"—a, St S2-"+<a,, x =ky has the maximum value 2*M,. 
Now choose the constants a, as 2~**. Then a, times the general term above is 
bounded by 


It is clear that the general term approaches 0 as y approaches 0 (x = ky)
uniformly over the allowed range of t. Given ε, choose N so that


2-"M; < 2, 
n=N+1 

and choose y₀ so that for y < y₀ and x = ky, the sum of the first N terms is
bounded in absolute value by ε/2. Thus for y < y₀ and x = ky, |u(x, y)| < ε, and
we have proved that u(x, y) → 0 as z → 0 along any ray to the origin. It is
obvious however that U(t)/t oscillates between 0 and 1 as t → 0 from the right,
and that [U(t) − U(−t)]/2t oscillates between 0 and 1/2 as t → 0 from the
right.
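
The oscillation of U(t)/t can be seen directly from the definition. The sketch below is only an illustration: it assumes the widest admissible half-widths a_n = 2^{−n−2}, not the much smaller values chosen in the text to make u tend to 0 along rays, and it checks the boundary behaviour of U only, not the ray limits of u. The function names are hypothetical.

```python
def half_width(n):
    # Assumed choice of a_n; the text only requires 0 <= a_n <= 2**(-n-2).
    return 2.0 ** (-n - 2)

def U(t, max_n=60):
    """Triangular-peak function: U = 2**-n at t = 2**-n, falling linearly
    to 0 at t = 2**-n - a_n and t = 2**-n + a_n, and 0 elsewhere."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    for n in range(1, max_n + 1):
        peak = 2.0 ** (-n)
        a_n = half_width(n)
        distance = abs(t - peak)
        if distance <= a_n:
            return peak * (1.0 - distance / a_n)
    return 0.0

# U(t)/t at the peak abscissas and at the right-hand feet of the peaks.
peaks = [2.0 ** (-n) for n in range(1, 11)]
feet = [2.0 ** (-n) + half_width(n) for n in range(1, 11)]
ratios_at_peaks = [U(t) / t for t in peaks]
ratios_at_feet = [U(t) / t for t in feet]
```

The first list stays at 1 and the second at 0 however small t becomes, so U(t)/t has no limit as t → 0 from the right.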

6. Generalizations and applications. The procedure of Theorem 1 is ade-
quate for situations more general than that described there. Suppose, for
instance, we have not that u(x, y) approaches u(0) but that

$$\int_0^1 u(rx, ry)\,dr$$

exists and approaches u(0) as z = x + iy approaches the origin along each of
two rays l₁ and l₂. We open up the angle space as before and get a positive
harmonic function u₁(x, y) admitting the representation

$$u_1(x, y) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{y}{y^2 + (t - x)^2}\,u_1(t)\,dt$$
with the hypothesis that 


(13) face | rt |8)dr u(0) 
0 


as r → 0, two separate statements being understood. Here β = π/α where α is
the angle between l₁ and l₂. We now need the following lemma:


LEMMA. If u(t) is a positive function such that for some a > 0

$$\int_0^s u(t^a)\,dt \sim ks$$

as s → 0, then

(14) $$\int_0^s u(t)\,dt \sim ks.$$


In proof we multiply both sides by s*-* and integrate from 0 to r. If we 
then integrate the left member by parts the conclusion follows. 

If rt is replaced by t in (13) and the lemma applied, we obtain as a con-
clusion precisely the hypothesis of the Fatou Theorem B for the harmonic
function u₁(x, y) (with U(s) absolutely continuous and equal to the integral
(14)). Therefore u₁(z) → u(0) as z → 0 “in angle,” which is equivalent to the
statement that u(z) → u(0) as z → 0 along any path between l₁ and l₂. We can
therefore apply Theorem 1 to infer that U′(0) exists and equals u(0). Also

$$\int_0^1 u(sx, sy)\,ds = \frac{1}{r}\int_0^r u(t\cos\theta,\, t\sin\theta)\,dt$$

where z = x + iy = r(cos θ + i sin θ), and the new hypothesis is thus that the
integral Hölder mean approaches u(0). We have thus proved the following
theorem:

THEOREM 2. Let u(z) = u(x, y) be a positive harmonic function in the upper
half-plane y > 0, and having therefore the representation (2) with U(t) increasing.
If u(z) has the (H, 1) limit u(0) as z = x + iy approaches the origin along each
of two rays, then U′(0) exists and equals u(0).

COROLLARY 1. As a consequence of the Fatou theorem u(x, y) has the ordinary
limit u(0) as z = x + iy approaches the origin along any ray of the upper half-
plane.

COROLLARY 2. If u(z) has the (H, n) limit u(0) as z approaches the origin
along each of two rays, then U′(0) exists and equals u(0).


This is a trivial consequence of the Landau lemma used in §4, for if u(t)
is positive, and

$$\int_0^s \frac{1}{t}\int_0^t u(r)\,dr\,dt \sim s\,u(0), \qquad s \to 0,$$

then direct application of the lemma gives that

$$\int_0^t u(r)\,dr \sim t\,u(0).$$

A finite number of such steps reduces the hypothesis of the corollary to that
of Theorem 2.

Again, it is clear that we have used the monotonicity of U(t) (positivity
of u(x, y)) only locally about the origin, and the hypothesis can be accord-
ingly weakened to that extent.

It is interesting to see what can be said for other statements of the Fatou
theorem. A somewhat stronger form of the theorem than that contained in
the first section concerns the ordinary Poisson integral representation

$$v(re^{i\theta}) = \frac{1}{2\pi}\int_0^{2\pi}\frac{1 - r^2}{1 + r^2 - 2r\cos(\theta - \varphi)}\,v(\varphi)\,d\varphi.$$

Let v_θ(z) = ∂v(re^{iθ})/∂θ. Then v_θ(z) is a harmonic function in |z| < 1 which does
not in general admit a Poisson-Stieltjes integral representation.

THEOREM A′. If v_s′(θ) = lim_{t→0} [v(θ + t) − v(θ − t)]/2t exists, then v_θ(re^{iθ})
→ v_s′(θ) as r → 1.

THEOREM B′. If the derivative v′(θ) exists, then v_θ(z) → v′(θ) uniformly as
z → e^{iθ} “in angle.”


The proofs are essentially the same as those for Theorems A and B. As
before, we must impose some further restriction on v(θ) in order to deduce the
converses of Theorems A′ and B′, and we try the local condition that
v(θ) − v(θ₀) change sign at θ₀, that is, that [v(θ) − v(θ₀)](θ − θ₀) be of constant
sign (admitting the value 0) in some neighborhood of θ₀. We may obviously
take v(θ₀) = 0. Thus in the half-plane our hypotheses are that u(x, y) has the
representation

$$u(x, y) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{y}{y^2 + (t - x)^2}\,u(t)\,dt,$$

where the integral is absolutely convergent, that tu(t) ≥ 0 in some neighbor-
hood of the origin, and that ∂u(x, y)/∂x = u_x(x, y) has the property that
u_x(rx, ry) → l as r → 0 for every (x, y) with y > 0. For the converse of A, the as-
sumption holds only along the ray x = 0. Now

$$u_x(x, y) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{2y(t - x)}{[y^2 + (t - x)^2]^2}\,u(t)\,dt,$$

$$u_x(0, ry) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{2yt}{[y^2 + t^2]^2}\,\frac{u(rt)}{r}\,dt.$$


The integrand here is positive in the neighborhood of r = 0, t = 0, and since
u_x(0, ry) is integrable over 0 < r < 1 we have


f «0, ry)dr == 


and in general 


and 




+ (¢ — t 


Since u(t)/t is non-negative in some neighborhood of t = 0, we can apply
Theorem 1 with u(t)/t = U′(t) to obtain

(a) $$\frac{1}{t}\int_0^t \frac{u(r) - u(-r)}{2r}\,dr \to l$$

as t → 0, as the converse of Theorem A′, and

(b) $$\frac{1}{t}\int_0^t \frac{u(r)}{r}\,dr \to l$$

as t → 0, as the converse of Theorem B′.

To obtain the symmetry of the earlier case we should now prove that (a) 
and (b) can be taken as weakened hypotheses for the Fatou Theorems A’ and 
B’. This is in fact the case, but we shall omit the proofs here since they are 
essentially the same as the proofs of Theorems A and B. 

The relation between v_θ(w) in the unit circle and u_x(x, y) in the half-plane
can be easily established. They are different functions even when transformed
so as to have the same domain of definition, but they have the same asymp-
totic properties at the origin in the half-plane.

If f(z) is a bounded analytic function in the unit circle |z| < 1 it is
known(11) that if lim f(z) exists as z approaches a boundary point e^{iθ} along
some curve, then f(z) has that limit as z → e^{iθ} “in angle.” Thus there is no
difference between situations A and B in this case. The Fatou theorem im-
plies(12) that lim_{r→1} f(re^{iθ}) exists for almost all θ, and the converse of the
Fatou theorem (Theorem 1) implies that if f(θ) is the boundary function thus
defined then f(θ) is precisely equal to the derivative of its indefinite integral.


(11) See Nevanlinna, Eindeutige Analytische Funktionen, p. 65.
(12) See Bieberbach, Funktionentheorie, pp. 147-148.


HARVARD UNIVERSITY, 
CAMBRIDGE, Mass. 


ON BOUNDED VARIATION AND ABSOLUTE CONTINUITY 
FOR PARAMETRIC REPRESENTATIONS 
OF CONTINUOUS SURFACES 


BY 
PAUL V. REICHELDERFER 


INTRODUCTION 
1. A continuous curve C in xyz-space may be defined by

C: x = x(u), y = y(u), z = z(u), a ≤ u ≤ b,

where each of the functions x(u), y(u), z(u) is continuous on the closed interval
[a, b]. The following facts are known(1) (see Rado [3, chap. I]; Saks [1,
chap. IV]).

(1) A necessary and sufficient condition that the length L(C) of C be
finite is that each of the functions x(u), y(u), z(u) be of bounded variation on
[a, b].

(2) If the length L(C) is finite, then each of the derivatives x′(u), y′(u),
z′(u) exists almost everywhere in [a, b], is summable on [a, b], and

$$L(C) \ge \int_a^b \{x'(u)^2 + y'(u)^2 + z'(u)^2\}^{1/2}\,du.$$

(3) A necessary and sufficient condition that the sign of equality hold in
this relation is that each of the functions x(u), y(u), z(u) be absolutely con-
tinuous on [a, b].
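
A standard example, not taken from this paper, shows how the inequality in (2) can be strict. For the curve x = u, y = c(u), z = 0 on [0, 1], where c is the Cantor function, each coordinate is continuous and of bounded variation, but c is not absolutely continuous and c′(u) = 0 almost everywhere. The integral in (2) therefore equals 1, while the length of the curve is 2. The sketch below approximates the length by inscribed polygons; the function names are hypothetical.

```python
import math

def cantor(k, n):
    """Value of the Cantor function at u = k / 3**n, for 0 <= k <= 3**n."""
    if k >= 3 ** n:
        return 1.0
    value, weight = 0.0, 0.5
    # Read the ternary digits of k from the most significant one down.
    for i in range(n - 1, -1, -1):
        digit = (k // 3 ** i) % 3
        if digit == 1:
            return value + weight
        if digit == 2:
            value += weight
        weight /= 2.0
    return value

def inscribed_length(n):
    """Length of the polygon inscribed in x = u, y = cantor(u), z = 0
    with vertices at u = k / 3**n, k = 0, ..., 3**n."""
    step = 3.0 ** (-n)
    values = [cantor(k, n) for k in range(3 ** n + 1)]
    return sum(math.hypot(step, values[k + 1] - values[k]) for k in range(3 ** n))

lengths = [inscribed_length(n) for n in range(1, 9)]
```

The inscribed lengths increase with n and approach 2, which is strictly greater than the value 1 of the integral, as statement (3) requires for a representation that is not absolutely continuous.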

2. A continuous surface S in xyz-space may be defined by

S: x = x(u, v), y = y(u, v), z = z(u, v), a ≤ u ≤ b, c ≤ v ≤ d,

where each of the functions x(u, v), y(u, v), z(u, v) is continuous on the closed
(two-dimensional) interval [a, b; c, d]. How may the concepts for the area
of the surface S and for bounded variation and absolute continuity of the
representation of S be defined so that theorems analogous to those for con-
tinuous curves cited in(2) 1 hold? For the special case in which S may be
given by relations of the form

S: x = u, y = v, z = f(u, v), a ≤ u ≤ b, c ≤ v ≤ d,


Presented to the Society, April 17, 1942; received by the editors June 11, 1942.

(1) Numbers in square brackets refer to papers listed in the bibliography at the end of this
paper.

(2) The notation I, 6, 2, for example, refers to chapter I, section 6, relation 2 in this paper.
When no chapter reference is given, the introduction is meant.




Geöcze and Tonelli have given a complete answer to this question (see Geöcze
[1], Tonelli [1]). But for the general case, no satisfactory answer seems to
be known.

3. It is the chief purpose of this paper to give an answer to the question 
just raised. In so doing, interesting generalizations and extensions of results 
in the literature will be obtained. Briefly, the program for procedure is the 
following. First, a definition for a continuous surface is made precise (see I, 
1). Now concepts for bounded variation and absolute continuity of a repre- 
sentation for a curve are phrased in terms of the corresponding representa- 
tions for the projections of this curve upon each of the coordinate axes. Here 
the definitions for bounded variation and absolute continuity of a representa- 
tion for a surface will be made in terms of the corresponding representations 
for the projections of this surface upon each of the coordinate planes. For 
representations of the latter type, a hierarchy of definitions for bounded 
variation and absolute continuity is extant(3) (see R² [1]). It will be desir-
able for the purpose of this paper to review these definitions, and to make
certain additions to the theory developed in the work just cited (see II, III). 
Next, a definition for the area of a continuous surface will be given (see IV). 
This definition will be compared with that of the Geöcze area as defined by
Rado (see IV, 17-20), and the Lebesgue area (see IV, 14-15); the latter area 
has been most frequently used in the literature. For the special case considered 
by Geöcze and Tonelli, it will be shown that the definitions advanced here
are equivalent to those which they used (see IV, 16). With the definitions 
for the area of a surface, and for bounded variation and absolute continuity 
of its representations, thus formulated, it will be shown that theorems hold 
for continuous surfaces which are analogous to those cited in 1 for curves 
(see IV, 4, 6-13; V, 9-15). As an application of this theory, some of the results 
of Rado and Reichelderfer on convergence in area for surfaces (see R² [2])
will be generalized (see V). 

4. For brevity, the following notations and conventions are adopted. The
u¹u²-plane will serve as a parameter plane; a point (u¹, u²) in it is denoted
simply by u. The surfaces will be in x¹x²x³-space; a point (x¹, x², x³) in this
space is denoted simply by x. With each point x, there are associated its projec-
tions on the respective coordinate planes given by

¹x = (0, x², x³), ²x = (x¹, 0, x³), ³x = (x¹, x², 0).

The planar exterior measure of any set E in the u-plane is denoted by |E|.
The set of interior points in the set E is denoted by E°.
If a = (a¹, a², a³) be any triple of real numbers, set

‖a‖ = [(a¹)² + (a²)² + (a³)²]^{1/2}.

A triple of real, finite, single-valued functions xⁱ(u), i = 1, 2, 3, each defined
on a set E in the u-plane is denoted by [x(u), E], where x(u) = (x¹(u), x²(u),
x³(u)) for u = (u¹, u²) in E. With each triple [x(u), E] there are associated
three triples [ⁱx(u), E] defined by

¹x(u) = (0, x²(u), x³(u)), ²x(u) = (x¹(u), 0, x³(u)), ³x(u) = (x¹(u), x²(u), 0),
u in E.

A triple [x(u), E] is said to possess any property which is possessed by each
of the xⁱ(u) for i = 1, 2, 3 on the set E.

(3) The symbol R² in this note is to be read “Rado and Reichelderfer.”

A two-dimensional interval in the u-plane is denoted generically by 
I, 3, [a, B], or [a', B'; a, 6?]; it consists of all points u satisfying a! <u! <A, 
where a! a? A connected open set in the u-plane is termed 
a domain, is denoted generically by D or D; if the boundary of a domain con- 
sists of a Jordan curve, then the closed connected set of points in the domain 
and on its boundary is termed a simple Jordan region, is denoted generically 
by 8, B or %. If the boundary of a domain consists of a finite number of 
Jordan curves, then the closed connected set of points in the domain and on 
its boundary is termed a Jordan region, is denoted generically by R. A se-
quence of domains D_n is said to fill up a domain D from the interior if each
domain D_n is contained in D, but for every closed set F in D there exists an
n(F) such that F is in D_n for every choice of n exceeding n(F). A sequence of
Jordan regions R_n is said to fill up a Jordan region R from the interior if their in-
teriors R_n° fill up R° from the interior.

If S is any simple Jordan region in the u-plane, then a finite system of 
nonoverlapping simple Jordan regions B lying in 8 is denoted generically 
by S(%). The maximum of the diameters of the domains B in S(%) is de- 
noted by ||.S(%)||. If 8 =)>B for B in S(B), then S(¥) is termed a subdivision 
of %. In particular, if each of the simple Jordan regions B in S(%) is an 
interval, then S(%) is termed a finite interval system; if, moreover, 8 is an 
interval and S(%) is a subdivision of 6, then S(%) is termed an interval 
subdivision. 

A b-function defined in % is a law which associates with every simple 
Jordan region B in $ a finite, real number ¢(B); this function is denoted by 
[¢, B]. This b-function is non-negative if ¢(B) is non-negative for every B in 
(%). For a finite system S(B), where B is any simple Jordan region in %, set 


o(S(B)) = for B in S(B); 

U(B; [¢, B]) = Lu.b. ¢(S(B)) for all finite systems S(B). 
Evidently ¢(B) U(B; [¢, B]) [¢, B]) for every choice of B in B. 
Hence if U(%; [¢, B]) is finite, then [U, 8] isa d-function, and [¢, B] is said 
to possess a U-function. If S(%) is any finite system, then clearly U(S(%); 
[¢, B]) U(B; [¢, B)). 


5. If [‘¢, B], i=1, 2, 3, is a triple of b-functions having a common range 
of definition, set 




¢(B) = (1¢(B), *6(B), *6(B)),  (B) =||¢(B)||, for 


If [¢, B] is a triple of non-negative b-functions, then clearly 
3 
< < > for Bin&. 
Hence a necessary and sufficient condition that [6, 8] have a U-function is 
that each member of the non-negative triple [¢, %] have a U-function, and 


1. U(B; [,B]) < U(B; [%, B]) s ve; for Bin®&. 


Elementary considerations lead to the following 


Lemma. Let [bn, %], n=0, 1, 2,--++, be sequence of triples of non- 
negative b-functions for which lim inf ‘¢,(B)=‘o(B) for B in 2, 3. 
Then lim inf ©,(B)=o(B); lim inf U(B; B]) = U(B; B]); lim 
inf U(B; => U(B; B]) for Bin B. 


CHAPTER I 
ON CONTINUOUS SURFACES 


1. A definition for a continuous surface will now be recalled; since this 
definition is in the literature (see Rado [2, 3]), it will be merely sketched here 
for the convenience of the reader, and for the purpose of fixing notation in 
the sequel. Consider the class of all continuous triples [x(u), 8], where 
is a simple Jordan region in the u-plane. Let [x:(u), 81], [xe(u), 82] be any 
two of these triples. Since 8; and ¥B, are simple Jordan regions, there exist 
topological maps of 8, onto B2 given by single-valued continuous pairs 
[a(u), 81] having single-valued continuous inverses on Bz. Let d(#) denote 
the maximum of —x2(i(u))|| for uin By. Put d([x:, [x2, B2]) equal 
to the greatest lower bound of d(#) for all topological maps [#(u), 81] of B1 
onto B2. It is easily verified that the binary relation d thus defined in the 
class of all continuous triples has all the properties of a distance except one: 
the fact that d([x:, 81], [x2, B2]) is zero does not imply that B, and B, are 
identical and x;(u) =x2(u) for u in B,-B2. In order to remedy this defect, 
one agrees that two of these triples [x1, 81], [xe, 82] are in the ~ relation 
provided d([x1, B:], [x2, B2]) =0. It is readily verified that the binary rela- 
tion ~ is an equivalence relation; hence it partitions the class of all continu- 
ous triples into mutually exclusive sets of triples mutually in the ~ relation; 
denote these sets generically by S. It follows that if S; and S; are any two of 
these sets, then d([x:, 81], [x2, B2]) has a value dy: which is independent of 
the choice of [x:, 1] in S; and Bs] in Sz. Set d(S:, Then 
d(S;, S2) has all the properties of a distance in the class of sets S. Each of the 
sets S is termed a continuous surface of the type of the circular disc. Any one 




of the continuous triples [x(u), 8] in S is termed a (parametric) representa- 
tion for the surface S. The distance d(.S:, S:) is known as the Fréchet distance 
of the surfaces S; and S:. A sequence of surfaces S, is said to converge to the 
surface So if d(S,, So) converges to zero. If S, converges to So, and if [xo(x), 
%o] be any representation for So, then there exist representations [x,(u), Bn] 
for S, such that %, is identical with 8» for every m and x,(u) converges on Bo 
uniformly to xo(x). 

2. Amongst the representations for a surface S, there may occur one 
[x(u), B] of the form 


1. x(u) = u*)), u = u*) in B. 


Denote by *% the image of % in the *x-plane under the topological map 
x'=x?, for (u', u*) in Then 


= x*), (x!, x?) in *B. 


denotes what is commonly called a non-parametric representation for S. For 
this reason, the representation [x(u), 8] in 1 will be termed a representation 
of non-parametric origin for S. By symmetry, one should also term any repre- 
sentation for S having one of the forms 


a representation of non-parametric origin for S. But this is unnecessary, since 
any of these forms may be brought into form 1 by a suitable change of nota- 
tion. A surface need not have a representation of non-parametric origin, but 
if it does, that representation is unique. 

3. Amongst the representations for a surface S, there may occur one 
[x(u), B] of the form 


1. = u?), 0), u = u*) in B. 


Geometrically speaking, such a surface lies entirely in the *x-plane. It is 
easily verified that any other representation for S must have the form 1. 
Such a surface is sometimes called a flat surface. The representation for S de- 
fines a continuous transformation from the simple Jordan region % in the u- 
plane to a bounded portion of the *x-plane. 

4. Let S be any continuous surface in x-space. If [x(u), 8] be any 
representation for S, then (see 4) [*x(u), 8] is a representation for a flat 
surface *S in the *x-plane, which is the projection of S on that plane. It fol- 
lows at once (see I, 1) that if [#(u), 8] is any other representation for S then 
[#(u), B] is another representation for *S—that is, the surface *S is uniquely 
determined by S. Thus a continuous surface S in x-space determines uniquely 
three flat projection surfaces 1S, 2S, *S on the coordinate planes ‘x, *x, *x, 
respectively. 




5. A real, finite-valued function f(u) = f(u¹, u²) defined on a simple polyg-
onal region B in the u-plane is termed quasi-linear if f(u) is continuous in B,
and if there exists a triangulation of B such that f(u) is a linear function of
u¹ and u² on each triangle of the triangulation. A continuous surface is termed
a polyhedron, and denoted by P, if it possesses a representation [x(u), B]
such that B is a simple polygonal region and each xⁱ(u) is quasi-linear on B.
Then there exists a triangulation of B such that each of the functions xⁱ(u)
for i = 1, 2, 3 is linear on every triangle in the triangulation. The image of
each triangle in this triangulation is a (possibly degenerate) triangle; the
sum of the areas of these image triangles is termed the elementary area of
P; denote it by a(P). Elementary considerations show that a(P) depends
only on the polyhedron P.
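
The elementary area just defined can be computed directly from a triangulation. The following is a minimal sketch, assuming the representation is supplied as a list of parameter-plane vertices, a list of index triples for the triangles, and a map x(u) that is linear on each triangle. The data layout and the names `triangle_area` and `elementary_area` are assumptions made for this illustration.

```python
import math

def triangle_area(p, q, r):
    """Area of the triangle with vertices p, q, r in 3-space (zero if degenerate)."""
    ax, ay, az = (q[i] - p[i] for i in range(3))
    bx, by, bz = (r[i] - p[i] for i in range(3))
    # Half the length of the cross product of the two edge vectors.
    cx, cy, cz = ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)

def elementary_area(vertices, triangles, x):
    """Sum of the areas of the image triangles under the map x."""
    images = [x(u) for u in vertices]
    return sum(triangle_area(images[i], images[j], images[k]) for i, j, k in triangles)

# Hypothetical example: the unit square split into two triangles, lifted to the plane x3 = x1.
vertices = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangles = [(0, 1, 2), (0, 2, 3)]
tilted_area = elementary_area(vertices, triangles, lambda u: (u[0], u[1], u[0]))
collapsed_area = elementary_area(vertices, triangles, lambda u: (u[0], 0.0, 0.0))
```

The second call maps every triangle onto a segment, which illustrates the "possibly degenerate" case: such triangles contribute nothing to a(P).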

6. Let S be any continuous surface. Then there always exist sequences
of polyhedra P_n such that P_n converges to S; lim inf a(P_n) is an upper bound
for the Lebesgue area A(S) of the surface; A(S) is the greatest lower bound of
all the upper bounds derived in this way. The Lebesgue area possesses the
following important properties (see Rado [2, 3]).

1. If P be any polyhedron, then A(P) = a(P).

2. There exists a sequence of polyhedra P_n such that P_n converges to S
and A(P_n) converges to A(S).

3. The Lebesgue area A(S) is a lower semi-continuous functional of S. 

7. Much of the literature on continuous surfaces restricts its considera- 
tions to surfaces having representations of non-parametric origin (see I, 2); 
in defining the “Lebesgue area”’ for such surfaces, it has been convenient to 
restrict the class of approximating polyhedra P, to have representations of 
non-parametric origin also. Let Ay(S) denote the area of a surface S having 
a representation of non-parametric origin when the class of approximating 
polyhedra is so restricted. Then clearly A(S) <A+(S), and it is important in 
comparing the literature to know that the sign of equality always holds. 
This fact is implicit in the work of Rado (see Rado [1]), but no explicit proof 
seems to be in the literature. Such a proof will be a corollary to one of the 
results in this paper (see IV, 15). 

8. Let S be any continuous surface. The following principle has been 
advanced by Rado and Reichelderfer to direct their work in the theory of 
continuous surfaces (see R? [2]). Assume that some sort of area—denote it by 
cA(S)—is defined for S and that, for each representation [x(u), 8] of S, 
some sort of Jacobians are defined for each of the projection representations 
[x, B], [2x, B], [*x, B] on the coordinate planes (see 4)—denote these ‘by 
1F(u), *F(u),*F(u), respectively, wherever they exist. Then a representation 
[x(u), B] for S is said to be absolutely continuous (A, Ff) provided that each 
of the Jacobians 'F(u), *¥(u), *¥(u) exists almost everywhere in the interior 
B° of B, is summable on 8°, and 




AS) = f. [9 + + 


CHAPTER II 
ON CONTINUOUS TRANSFORMATIONS IN THE PLANE 


In sections 1-9, 13, the salient features of the theory of bounded variation 
and absolute continuity for continuous transformations in the plane de- 
veloped by Rado and Reichelderfer (see R? [1]) are summarized. For all 
details, the reader is referred to the cited paper(4). Minor notational changes
have been made to place the results in a form more convenient for the pur- 
poses of this paper. In sections 10-12, 14-22 extensions of this theory are dis- 
cussed. 

1. Let the ξ-plane be any plane in x-space; on it choose a rectangular coordinate
system ξ¹, ξ², and adopt notations similar to those introduced in 4 for the
u-plane. Let 𝔇 be any bounded domain in the u-plane. If ξ(u) = (ξ¹(u¹, u²),
ξ²(u¹, u²)) be a pair of real, single-valued functions defined, continuous, and
bounded in 𝔇, then [ξ(u), 𝔇] defines a bounded continuous transformation T,
which associates with every point u in 𝔇 a point ξ = ξ(u) in a bounded portion
of the ξ-plane. If E be any set in the u-plane, let T(E) denote the set of all
points ξ₀ in the ξ-plane for which there exists a point u₀ in E such that
ξ(u₀) = ξ₀. If E be any set in the ξ-plane, let T⁻¹(E) denote the set of all points
u₀ in 𝔇 such that ξ(u₀) is in E. If T₁: [ξ₁(u), 𝔇₁], T₂: [ξ₂(u), 𝔇₂] are two
bounded continuous transformations, then their distance ρ(T₁, T₂; E) on any
set E in both 𝔇₁ and 𝔇₂ is the least upper bound of ||ξ₁(u) − ξ₂(u)|| for u in E.
For any set E in the u-plane, and for any point ξ₀ in the ξ-plane, N(ξ₀, T, E)
is defined to be the number (possibly +∞) of points in the set T⁻¹(ξ₀)·E.
For fixed ξ and T, N(ξ, T, E) is a non-negative completely additive set function.
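The two quantities just defined, the distance between transformations and the multiplicity N, can be made concrete on a toy case. The sketch below is only an illustration under stated assumptions: the fold map (u¹, u²) → (|u¹|, u²) on the open square |u¹| < 1, |u²| < 1 is a hypothetical example not taken from the paper, the preimages are supplied in closed form, and the distance is evaluated on a finite sample, so it gives only a lower bound for the least upper bound in the definition. All function names are invented for the illustration.

```python
import math

def rho(T1, T2, sample):
    """Largest distance ||T1(u) - T2(u)|| over a finite sample of points u.

    This is only a lower bound for the least upper bound over the whole set.
    """
    return max(math.dist(T1(u), T2(u)) for u in sample)

def count_models(xi0, preimages, in_E):
    """N(xi0, T, E): number of points of T^{-1}(xi0) lying in the set E.

    `preimages` is an explicit solver returning all u with T(u) = xi0.
    """
    return sum(1 for u in preimages(xi0) if in_E(u))

def fold(u):
    """Hypothetical toy transformation: fold the plane along the line u1 = 0."""
    return (abs(u[0]), u[1])

def fold_preimages(xi):
    """All points u in the plane with fold(u) = xi."""
    x1, x2 = xi
    if x1 < 0:
        return []
    if x1 == 0:
        return [(0.0, x2)]
    return [(x1, x2), (-x1, x2)]

def identity(u):
    return u

def in_open_square(u):
    return abs(u[0]) < 1 and abs(u[1]) < 1

def in_right_half(u):
    return in_open_square(u) and u[0] > 0

# A coarse sample of the open square, used only for the distance estimate.
grid = [(-0.9 + 0.1 * i, -0.9 + 0.1 * j) for i in range(19) for j in range(19)]
sampled_distance = rho(fold, identity, grid)
```

On the whole square a point with positive first coordinate has two models, while restricting E to the right half of the square leaves one; this is the additivity of N in its set argument.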

2. If ℜ be any Jordan region in 𝔇, and if k be any non-negative integer,
define R(k, T, ℜ) to be the set of those points ξ₀ in the ξ-plane for which
there exists a positive number ε such that N(ξ₀, T*, ℜ) ≥ k for every bounded
continuous transformation T* satisfying ρ(T*, T; ℜ) < ε. Clearly R(k, T, ℜ)
contains R(k+1, T, ℜ) for k = 0, 1, 2, ···. Define

\[ R(\infty, T, \mathfrak{R}) = \prod_{k=0}^{\infty} R(k, T, \mathfrak{R}). \]

A function K(ξ, T, ℜ) is defined by the relations

\[ K(\xi, T, \mathfrak{R}) = \begin{cases} k & \text{on } R(k, T, \mathfrak{R}) - R(k+1, T, \mathfrak{R}); \\ +\infty & \text{on } R(\infty, T, \mathfrak{R}). \end{cases} \]


(4) The introduction of the R² paper contains a summary of their results, together with
reference to the location of the proofs.




Given a domain D in 𝔇 let ℜₙ be a sequence of Jordan regions whose interiors
fill up D from the interior (see 4). For fixed ξ and T, the sequence K(ξ, T, ℜₙ)
has a limit (possibly +∞) which is independent of the choice of the sequence
of regions ℜₙ whose interiors fill up D. This limit is denoted by K(ξ, T, D),
and is termed the essential multiplicity of ξ under T with respect to D. It has
the following properties.

1. The essential multiplicity K(ξ, T, D), for fixed T and D, is a lower
semi-continuous function of ξ.

2. If Dₙ is a sequence of domains filling up D from the interior, if Tₙ: [ξₙ(u),
Dₙ] is a sequence of bounded continuous transformations such that for every
closed set F in D it is true that lim ρ(Tₙ, T; F) = 0, then for every ξ one has
lim inf K(ξ, Tₙ, Dₙ) ≥ K(ξ, T, D). In particular, if Tₙ is given by [ξ(u), Dₙ], then
lim K(ξ, Tₙ, Dₙ) = K(ξ, T, D).

3. For any Jordan region ℜ in 𝔇, it is true that K(ξ, T, ℜ⁰) = K(ξ, T, ℜ).

3. A set B in the u-plane is termed a base set for the transformation T if it
is measurable, and for every closed oriented square s whose interior s⁰ is in
𝔇, the set T(s⁰·B) is measurable. Let B be any base set for T. Define, for
any closed oriented square s whose interior s⁰ is in 𝔇

\[ G(s, T, B) = |T(s^0 \cdot B)|. \tag{1} \]


The transformation T is said to be of bounded variation with respect to the
base set B (briefly, BV B) if there exists a finite positive constant M such
that for any finite system of nonoverlapping, closed, oriented squares sᵢ
whose interiors are in 𝔇, it is true that

\[ \sum_i G(s_i, T, B) < M. \]

The transformation T is said to be absolutely continuous with respect to the
base set B (briefly, AC B) if for every positive number ε, there exists a
positive number η, such that

\[ \sum_i G(s_i, T, B) < \varepsilon \]

for every finite system of nonoverlapping, closed, oriented squares sᵢ whose
interiors are in 𝔇 and for which

\[ \sum_i |s_i| < \eta. \]

If T is AC B, then it follows that T is BV B.
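A worked instance may help fix the two definitions. The following is only an illustration under assumptions of our own, not an example from the paper: take a nonsingular linear map ξ = Au on a bounded domain 𝔇, with the whole domain 𝔇 as base set, and write |𝔇| for the measure of 𝔇. For every closed oriented square s whose interior lies in 𝔇 the image of s⁰ is a parallelogram of measure |det A| |s|, so both conditions can be checked by hand:

```latex
\begin{align*}
G(s, T, \mathfrak{D}) &= |T(s^0)| = |\det A|\,|s|, \\
\sum_i G(s_i, T, \mathfrak{D}) &= |\det A| \sum_i |s_i|
   \le |\det A|\,|\mathfrak{D}| < M := |\det A|\,|\mathfrak{D}| + 1, \\
\sum_i |s_i| < \eta := \frac{\varepsilon}{|\det A|}
   &\;\Longrightarrow\; \sum_i G(s_i, T, \mathfrak{D}) < \varepsilon .
\end{align*}
```

The second line uses only that nonoverlapping squares with interiors in 𝔇 have total measure at most |𝔇|; the third line exhibits the η required in the definition of absolute continuity.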


Lemma. If B₁ and B₂ are base sets for T, then B = B₁ + B₂ is a base set for
T. A necessary and sufficient condition that T be BV B is that T be both BV B₁
and BV B₂. A necessary and sufficient condition that T be AC B is that T be
both AC B₁ and AC B₂.


4. A necessary and sufficient condition that a bounded continuous transformation
T be BV B is that N(ξ, T, 𝔇·B) be summable(5). If T is BV B,
then N(ξ, T, E·B) is measurable and summable for any open or closed set
E relative to 𝔇. If T is BV B, then the function of squares G(s, T, B) defined
in II, 3, 1 possesses a derivative D(u, T, B) almost everywhere in 𝔇. This
derivative is summable in 𝔇, and one has, on every open set O in 𝔇(6)

\[ \int_O D(u, T, B)\, du \le \int N(\xi, T, O \cdot B)\, d\xi. \]


If T is AC B, then the sign of equality holds here. Conversely, if the sign of
equality holds here for O = 𝔇, then it holds for every open set O in 𝔇, and T is
AC B. If T is AC B, and if E is any measurable set in 𝔇, then N(ξ, T, E·B)
is measurable and summable; thus if H(ξ) is a finite-valued, measurable
function, then H(ξ)N(ξ, T, E·B) is a measurable function. Under these conditions,
it is also true that H(ξ(u))D(u, T, B) is measurable in 𝔇. Finally,
if the transformation T is AC B; the function H(ξ) is finite-valued and
measurable; the set E in 𝔇 is measurable; one of H(ξ)N(ξ, T, E·B),
H(ξ(u))D(u, T, B) is summable, then both of these functions are summable,
and

\[ \int_E H(\xi(u))\, D(u, T, B)\, du = \int H(\xi)\, N(\xi, T, E \cdot B)\, d\xi. \tag{2} \]


5. Radó and Reichelderfer studied closely the notions of BV B and AC B
for a certain choice of the base set B which is now described. Let ℜ be any
Jordan region in 𝔇 (see 4). Let ξ₀ be any point in the ξ-plane not on the
image under T of the boundary of ℜ. If C be one of the curves bounding ℜ,
then the image of C under T, taken as u traverses C in a positive sense relative
to ℜ, is a directed, closed continuous curve not passing through ξ₀;
consequently ξ₀ has a well defined topological index with respect to this curve. The
sum of these indices taken over all the boundary curves of ℜ is denoted by
μ(ξ₀, T, ℜ). For points ξ₀ on the image under T of the boundary of ℜ, one
puts μ(ξ₀, T, ℜ) = 0.
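The index sum described above can be approximated numerically for a region bounded by a single curve. The sketch below is a minimal illustration under our own assumptions, not a construction from the paper: the region is a closed oriented square, its boundary is sampled counterclockwise, and the topological index of a point with respect to the image curve is obtained by summing signed angle increments. The two test maps (complex squaring and a fold along u¹ = 0) and all function names are hypothetical, and the result is reliable only when the sampled image curve stays clear of the point.

```python
import math

def topological_index(curve, point):
    """Winding number of a closed polygonal curve about `point`.

    `curve` is a list of vertices; the curve is closed by joining the last
    vertex to the first. Signed angle increments are summed and divided by 2*pi.
    """
    px, py = point
    total = 0.0
    n = len(curve)
    for i in range(n):
        x0, y0 = curve[i]
        x1, y1 = curve[(i + 1) % n]
        d = math.atan2(y1 - py, x1 - px) - math.atan2(y0 - py, x0 - px)
        # Wrap the increment into (-pi, pi].
        while d <= -math.pi:
            d += 2 * math.pi
        while d > math.pi:
            d -= 2 * math.pi
        total += d
    return round(total / (2 * math.pi))

def square_boundary(center, half, steps=400):
    """Counterclockwise samples of the boundary of a closed oriented square."""
    cx, cy = center
    corners = [(cx - half, cy - half), (cx + half, cy - half),
               (cx + half, cy + half), (cx - half, cy + half)]
    pts = []
    for k in range(4):
        (x0, y0), (x1, y1) = corners[k], corners[(k + 1) % 4]
        for s in range(steps):
            t = s / steps
            pts.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return pts

def mu(point, transform, center, half):
    """Index of `point` with respect to the image of the square's boundary."""
    image = [transform(u) for u in square_boundary(center, half)]
    return topological_index(image, point)

def squaring(u):
    """Hypothetical test map: complex squaring, (u1 + i u2)^2."""
    u1, u2 = u
    return (u1 * u1 - u2 * u2, 2 * u1 * u2)

def fold(u):
    """Hypothetical test map: fold along the line u1 = 0."""
    return (abs(u[0]), u[1])
```

For the squaring map the image of a square centered at the origin winds twice about the origin, while for the fold the contributions of the two preimages of a point cancel on any square that contains both of them.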

6. Let ξ₀ be any point in the ξ-plane. The set T⁻¹(ξ₀) is a closed set relative
to 𝔇, hence decomposes in a unique way into components which are
maximal connected closed sets relative to 𝔇. If a component of T⁻¹(ξ₀) has
a positive distance from the boundary of 𝔇, then it is a connected closed set
in the absolute sense (that is, a continuum); such a component is termed a
maximal model continuum for ξ₀ under T in 𝔇, and is denoted generically by
σ(ξ₀, T). A σ(ξ₀, T) is termed essential if in every open neighborhood of
σ(ξ₀, T), there is a Jordan region ℜ containing σ(ξ₀, T) in its interior and for


(5) Since all functions considered in the ξ-plane are zero outside a sufficiently large disc,
they are termed summable whenever they are summable on such a disc, and no range of
integration will be explicitly indicated.

(6) See Footnote 5.




which μ(ξ₀, T, ℜ) is not zero. If D be any subdomain in 𝔇, then the number
of essential maximal model continua σ(ξ₀, T) for ξ₀ under T in D is equal to
the essential multiplicity K(ξ₀, T, D) (see II, 2). If ℜ be any Jordan region in
𝔇 for which μ(ξ₀, T, ℜ) is not zero, then ξ₀ has an essential maximal model
continuum under T in the interior of ℜ.

7. If D be any domain in 𝔇, then denote by E(T, D) the set of all points
u₀ which belong to some essential maximal model continuum for ξ(u₀) under
T in D. Denote by 𝔈(T, D) that subset of E(T, D) which consists of all those
points u₀ which themselves constitute essential maximal model continua for
ξ(u₀) under T in D; evidently 𝔈(T, D) = 𝔈(T, 𝔇)·D, but a similar formula
does not generally hold for E(T, D). If u₀ be any point of 𝔈(T, 𝔇) which has a
neighborhood free of points belonging to other essential maximal model
continua for ξ(u₀) under T in 𝔇, then it is true that μ(ξ(u₀), T, ℜ) has a
nonzero value independent of the choice of a Jordan region ℜ in this neighborhood
which contains u₀ in its interior and whose boundary contains no point
of T⁻¹(ξ(u₀)); denote this value by j(u₀, T). For all points u* in 𝔇 not having
the properties of u₀, set j(u*, T) equal to zero. Then j(u, T) is a Baire function
in 𝔇 and |j(u, T)| does not exceed one, except possibly on a denumerable
set of points in 𝔇.

8. It is the set 𝔈(T, 𝔇) which Radó and Reichelderfer employ for a base
set (see II, 3); it is the set E(T, 𝔇) which plays a prominent role in the following
theory. Because the results for these two base sets are so closely related,
the results for the set 𝔈(T, 𝔇) developed by Radó and Reichelderfer are now
summarized as a basis for stating and proving results for the set E(T, 𝔇).
Let K₀ denote the class of all bounded continuous transformations T: [ξ(u),
𝔇] which are BV 𝔈(T, 𝔇). Let K₁ denote that subclass of K₀ (see II, 3)
consisting of all transformations T which are AC 𝔈(T, 𝔇). Let K₂ denote the
class of all transformations T in K₁ for which the relation N(ξ, T, 𝔈(T, 𝔇))
= K(ξ, T, 𝔇) holds almost everywhere. Finally, let K₃ denote the class of all
transformations T in K₂ for which the ordinary Jacobian

\[ J(u, T) = \xi^1_{u^1}\,\xi^2_{u^2} - \xi^1_{u^2}\,\xi^2_{u^1}, \qquad u = (u^1, u^2) \text{ in } \mathfrak{D}, \]


exists almost everywhere in 𝔇. If T is in K₀, then D(u, T, 𝔈(T, 𝔇)) exists
almost everywhere in 𝔇 and is summable on 𝔇 (see II, 3); denote this derivative
by D(u, T). Define

\[ \mathfrak{J}(u, T) = \begin{cases} j(u, T)\, D(u, T) & \text{wherever } D(u, T) \text{ exists}; \\ 0 & \text{otherwise.} \end{cases} \]

The function 𝔍(u, T) is termed the generalized Jacobian for the transformation
T. From II, 7 it follows that |𝔍(u, T)| ≤ D(u, T) almost everywhere in
𝔇, and 𝔍(u, T) is measurable in 𝔇; hence 𝔍(u, T) is summable in 𝔇. If T is in
the class K₂, it follows that |𝔍(u, T)| = D(u, T) almost everywhere in 𝔇. If
T is in the class K₃, it is true that 𝔍(u, T) = J(u, T) almost everywhere in 𝔇.




The class K₃ contains all bounded continuous transformations T: [ξ(u), 𝔇]
which satisfy a Lipschitz condition in the following restricted sense: there
exists a finite constant L such that if u₁ and u₂ are any two points of 𝔇 for
which the line segment joining them is contained in 𝔇, then ||ξ(u₁) − ξ(u₂)||
≤ L||u₁ − u₂||. If T is a bounded continuous transformation for which the
ordinary Jacobian exists almost everywhere in 𝔇, and for which K(ξ, T, 𝔇)
is summable, then D(u, T) ≥ |J(u, T)| almost everywhere in 𝔇(7).
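For a Lipschitz transformation the statements above suggest that the integral of |J(u, T)| over the domain should agree with the integral of the multiplicity over the image plane. The following sketch checks this numerically on a toy case. Everything in it is an assumption made for illustration and not taken from the paper: the map ξ¹ = (u¹)², ξ² = u² on the open square |u¹| < 1, |u²| < 1, the closed-form count of preimages, and the use of that plain count in place of the essential multiplicity (the two differ here only on the line ξ¹ = 0, a set of measure zero).

```python
# Toy check: integral of |J| du versus integral of the multiplicity d(xi)
# for the hypothetical map xi1 = u1**2, xi2 = u2 on the open square (-1, 1)^2.

def jacobian_abs(u1, u2):
    """|J(u, T)| for the toy map; here J = 2 * u1."""
    return abs(2.0 * u1)

def multiplicity(xi1, xi2):
    """Number of points u of the open square with T(u) = (xi1, xi2)."""
    if not (-1.0 < xi2 < 1.0) or xi1 < 0.0 or xi1 >= 1.0:
        return 0
    return 1 if xi1 == 0.0 else 2

def midpoint_integral(func, a1, b1, a2, b2, n=200):
    """Midpoint rule for the double integral of func over [a1, b1] x [a2, b2]."""
    h1 = (b1 - a1) / n
    h2 = (b2 - a2) / n
    total = 0.0
    for i in range(n):
        x = a1 + (i + 0.5) * h1
        for j in range(n):
            y = a2 + (j + 0.5) * h2
            total += func(x, y)
    return total * h1 * h2

# Left side: integral of |J| over the square in the u-plane.
lhs = midpoint_integral(jacobian_abs, -1.0, 1.0, -1.0, 1.0)
# Right side: integral of the multiplicity over the image rectangle.
rhs = midpoint_integral(multiplicity, 0.0, 1.0, -1.0, 1.0)
```

Both integrals can also be evaluated by hand: the first is 2 · ∫|2u¹| du¹ over (−1, 1), and the second is twice the area of the image rectangle.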
9. Combining results stated in II, 4, 8, one obtains the 


1. Lemma. Let T: [ξ(u), 𝔇] be a bounded continuous transformation for
which K(ξ, T, 𝔇) is summable. Then 𝔍(u, T) exists almost everywhere in 𝔇,
is summable on 𝔇, and


for every domain D in 𝔇. A necessary and sufficient condition that all the signs
of equality hold here for D = 𝔇 is that T be in the class K₂. When T is in K₂, all
the signs of equality also hold for every domain D in 𝔇.


2. Lemma. Let T: [ξ(u), 𝔇] be a bounded continuous transformation for
which K(ξ, T, 𝔇) is summable, and for which the ordinary Jacobian J(u, T)
exists almost everywhere in 𝔇. Then J(u, T) is summable on 𝔇, and


for every domain D in 𝔇. A necessary and sufficient condition that all the signs
of equality hold here for D = 𝔇 is that T be in the class K₃. When T is in K₃,
all the signs of equality also hold for every domain D in 𝔇.


10. Lemma. If T: [ξ(u), 𝔇] be any bounded continuous transformation, then
the set E(T, 𝔇) (see II, 7) is a product of open sets, hence a Borel set.


Proof. Let n be any positive integer. Denote by Eₙ the set of points u₀ in
𝔇 for each of which there exists a Jordan region ℜ in 𝔇 satisfying the
following conditions: u₀ lies in the interior of ℜ; T(ℜ) lies in the open disc
||ξ − ξ(u₀)|| < n⁻¹; μ(ξ(u₀), T, ℜ) ≠ 0. Evidently each Eₙ is an open set. One
easily verifies that ∏Eₙ = E(T, 𝔇), so the lemma is established.

Set e(T, 𝔇) = E(T, 𝔇) − 𝔈(T, 𝔇); that is, e(T, 𝔇) is the set of those points
u₀ belonging to some nondegenerate essential maximal model continuum for
ξ(u₀) under T in 𝔇. Radó and Reichelderfer have shown that 𝔈(T, 𝔇) is also
a product of open sets, hence a Borel set. Thus e(T, 𝔇) is a Borel set. By a general
theorem (see Kuratowski [1, p. 249]) it follows that for every choice of
an open square s⁰ in the u-plane, the sets T(s⁰·e(T, 𝔇)), T(s⁰·E(T, 𝔇)) are

(7) The last result is established in §5.6 of the R² paper.




both measurable. Thus both the sets e(T, 𝔇), E(T, 𝔇) may serve as base sets
(see II, 3), and the general theory in II, 3, 4 is applicable.


11. Lemma. A necessary and sufficient condition that the set T(e(T, 𝔇)) be
of measure zero is that T be BV e(T, 𝔇). Whenever T is BV e(T, 𝔇), it is also
AC e(T, 𝔇).


Proof. Observe that (see II, 1)

\[ N(\xi, T, e(T, \mathfrak{D})) = \begin{cases} +\infty & \text{for } \xi \text{ in } T(e(T, \mathfrak{D})); \\ 0 & \text{otherwise.} \end{cases} \]

Thus a necessary and sufficient condition that N(ξ, T, e(T, 𝔇)) be summable
is that |T(e(T, 𝔇))| = 0; in view of the facts in II, 4, this establishes the first
part of the lemma. If T is BV e(T, 𝔇), it follows at once that

\[ 0 \le \int_{\mathfrak{D}} D(u, T, e(T, \mathfrak{D}))\, du \le \int N(\xi, T, e(T, \mathfrak{D}))\, d\xi = 0, \]

so that the sign of equality holds here, and T is AC e(T, 𝔇).
From this lemma and the lemma in II, 3 follows the 


Corollary. Let T be any bounded continuous transformation which is
BV e(T, 𝔇). A necessary and sufficient condition that T be BV E(T, 𝔇) is
that T be BV 𝔈(T, 𝔇). A necessary and sufficient condition that T be AC E(T, 𝔇)
is that T be AC 𝔈(T, 𝔇).


If T is BV e(T, 𝔇) it is evident that (see II, 3, 7, 10) for every closed
oriented square s whose interior is in 𝔇,

\[ G(s, T, E(T, \mathfrak{D})) = G(s, T, \mathfrak{E}(T, \mathfrak{D})). \]

Thus whenever T is BV e(T, 𝔇), it follows that D(u, T, E(T, 𝔇)) = D(u, T,
𝔈(T, 𝔇)) = D(u, T) almost everywhere in 𝔇 (see II, 4, 8).

12. Let T: [ξ(u), 𝔇] be a bounded continuous transformation. From the
definitions in II, 1, 2, 6, 7, it is clear that

\[ N(\xi, T, E(T, \mathfrak{D})) \ge K(\xi, T, \mathfrak{D}) \ge N(\xi, T, \mathfrak{E}(T, \mathfrak{D})), \tag{1} \]

and the sign of inequality holds between any two of these three functions if
and only if ξ lies in the set T(e(T, 𝔇)) and one of the functions involved is
finite. Since K(ξ, T, 𝔇) is measurable (see II, 2, 1), it follows that a necessary
condition that K(ξ, T, 𝔇) be summable is that T be BV 𝔈(T, 𝔇), while a sufficient
condition that K(ξ, T, 𝔇) be summable is that T be BV E(T, 𝔇); in this
latter case, the signs of equality in relation 1 hold almost everywhere. Now
suppose that T is BV 𝔈(T, 𝔇); then since N(ξ, T, 𝔈(T, 𝔇)) is finite almost
everywhere, one concludes that a necessary and sufficient condition that
K(ξ, T, 𝔇) = N(ξ, T, 𝔈(T, 𝔇)) almost everywhere is that T be BV E(T, 𝔇).




In view of the definition of the class K₂ (see II, 8), the corollary in the
preceding section, and these remarks, one obtains the


Lemma. A necessary and sufficient condition that a bounded continuous
transformation T: [ξ(u), 𝔇] be in the class K₂ is that T be AC E(T, 𝔇).
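The reasoning behind this lemma can be laid out as a chain of equivalences. The block below is a reader's summary assembled from II, 3, 8, 11, 12, not a proof given in this form in the paper. It writes E for the set of all points lying on essential maximal model continua, 𝔈 for the subset of points which themselves constitute such continua (II, 7), and e = E − 𝔈, with the arguments (T, 𝔇) suppressed.

```latex
\begin{align*}
T \in K_2
  &\iff T \text{ is AC } \mathfrak{E} \ \text{ and } \
        N(\xi, T, \mathfrak{E}) = K(\xi, T, \mathfrak{D}) \text{ a.e.}
        && \text{(definition, II, 8)} \\
  &\iff T \text{ is AC } \mathfrak{E} \ \text{ and } \ T \text{ is BV } E
        && \text{(II, 12, since AC } \mathfrak{E} \text{ implies BV } \mathfrak{E}\text{)} \\
  &\iff T \text{ is AC } \mathfrak{E} \ \text{ and } \ T \text{ is BV } e
        && \text{(lemma in II, 3, with } E = e + \mathfrak{E}\text{)} \\
  &\iff T \text{ is AC } \mathfrak{E} \ \text{ and } \ T \text{ is AC } e
        && \text{(lemma in II, 11)} \\
  &\iff T \text{ is AC } E
        && \text{(lemma in II, 3).}
\end{align*}
```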


13. The lemma in the preceding section gives a simple characterization of
transformations in the class K₂ and makes available all the results established
by Radó and Reichelderfer for this class whenever T is AC E(T, 𝔇). For
example, they have the


CLOSURE THEOREM. Let there be given bounded domains 𝔇 and 𝔇ₙ in the
u-plane and bounded continuous transformations T: [ξ(u), 𝔇] and Tₙ: [ξₙ(u),
𝔇ₙ], n = 1, 2, ···, with the following properties: (i) the domains 𝔇ₙ fill up 𝔇
from the interior (see 4); (ii) the generalized Jacobian 𝔍(u, T) exists almost
everywhere in 𝔇 and is summable on 𝔇 (see II, 8); (iii) Tₙ is in K₂ for n = 1,
2, ···; (iv) for every closed set F contained in 𝔇, it is true that

\[ \lim \rho(T_n, T; F) = 0, \qquad \lim \int_F |\mathfrak{J}(u, T_n) - \mathfrak{J}(u, T)|\, du = 0. \]

Then T is in K₂.


Using the preceding results, this theorem may be restated and improved 
as follows. 


14. MODIFIED CLOSURE THEOREM. Let there be given bounded domains 𝔇
and 𝔇ₙ in the u-plane and bounded continuous transformations T: [ξ(u), 𝔇]
and Tₙ: [ξₙ(u), 𝔇ₙ], n = 1, 2, ···, with the following properties: (i) the
domains 𝔇ₙ fill up 𝔇 from the interior; (ii) the generalized Jacobian 𝔍(u, T)
exists almost everywhere in 𝔇 and is summable on 𝔇; (iii) Tₙ is AC E(Tₙ, 𝔇ₙ)
for n = 1, 2, ···; (iv′) there exists a sequence of Jordan regions ℜₘ in 𝔇 whose
interiors ℜₘ⁰ fill up 𝔇 from the interior, and such that

\[ \lim_{n} \rho(T_n, T; \mathfrak{R}_m) = 0, \qquad \lim_{n} \int_{\mathfrak{R}_m} |\mathfrak{J}(u, T_n)|\, du = \int_{\mathfrak{R}_m} |\mathfrak{J}(u, T)|\, du, \qquad m = 1, 2, \cdots. \]

Then T is AC E(T, 𝔇).

Inspection of the proof of Radó and Reichelderfer for their closure theorem
reveals that property (iv′), which is a consequence of (iv), is all that is needed
for that proof.

COROLLARY. Condition (iv′) of the above theorem may be replaced by the
following condition: (iv″) for every closed oriented square s contained in 𝔇, it is
true that

\[ \lim \rho(T_n, T; s) = 0, \qquad \lim \int_s |\mathfrak{J}(u, T_n)|\, du = \int_s |\mathfrak{J}(u, T)|\, du. \]


Proof. Let ℜ denote any Jordan region in 𝔇 which may be expressed as a
sum of a finite number of nonoverlapping closed oriented squares s. From
(iv″) it follows that

\[ \lim \rho(T_n, T; \mathfrak{R}) = 0, \qquad \lim \int_{\mathfrak{R}} |\mathfrak{J}(u, T_n)|\, du = \int_{\mathfrak{R}} |\mathfrak{J}(u, T)|\, du. \]

By a known theorem in topology (see Kerékjártó [1]), there exists a sequence
of such regions ℜₘ whose interiors fill up 𝔇 from the interior. Thus condition (iv″)
implies (iv′), and the corollary is established.

15. In view of the lemma in II, 12, the class K₃ defined in II, 8 may be
characterized as the class of all transformations T: [ξ(u), 𝔇] which are
AC E(T, 𝔇) and for which the ordinary Jacobian J(u, T) exists almost everywhere
in 𝔇. For the class K₃ another closure theorem is given by Radó and
Reichelderfer, the proof of which is based upon the closure theorem for the
class K₂ stated in II, 13. By using the modified closure theorem just given,
one may parallel their proof to establish the


MODIFIED CLOSURE THEOREM. Let there be given bounded domains 𝔇 and
𝔇ₙ and bounded continuous transformations T: [ξ(u), 𝔇] and Tₙ: [ξₙ(u), 𝔇ₙ],
n = 1, 2, ···, with the following properties: (i) the domains 𝔇ₙ fill up 𝔇 from
the interior; (ii) the ordinary Jacobian J(u, T) exists almost everywhere in 𝔇
and is summable on 𝔇; (iii) Tₙ is AC E(Tₙ, 𝔇ₙ) and the ordinary Jacobians
J(u, Tₙ) exist almost everywhere in 𝔇ₙ for n = 1, 2, ···; (iv′) there exists a
sequence of Jordan regions ℜₘ in 𝔇 whose interiors ℜₘ⁰ fill up 𝔇 from the interior,
and such that

\[ \lim_{n} \rho(T_n, T; \mathfrak{R}_m) = 0, \qquad \lim_{n} \int_{\mathfrak{R}_m} |J(u, T_n)|\, du = \int_{\mathfrak{R}_m} |J(u, T)|\, du, \qquad m = 1, 2, \cdots. \]

Then T is AC E(T, 𝔇).
COROLLARY. Condition (iv′) of the above theorem may be replaced by the
following condition: (iv″) for every closed oriented square s contained in 𝔇, it is
true that

\[ \lim \rho(T_n, T; s) = 0, \qquad \lim \int_s |J(u, T_n)|\, du = \int_s |J(u, T)|\, du. \]


If in these results, condition (iii) be weakened by dropping the requirement
that the ordinary Jacobians J(u, Tₙ) exist almost everywhere in 𝔇ₙ for
n = 1, 2, ···, and if these Jacobians be replaced by the generalized Jacobians
𝔍(u, Tₙ) for n = 1, 2, ···, then the conclusions remain the same.




16. The results to be established in the following sections are necessary
for a comparison of certain notions in this paper with those in the literature.
In the theory of bounded continuous transformations just sketched, the
range of definition has been a bounded domain 𝔇 (see I, 1). In applications,
one may have a continuous transformation given by [ξ(u), 𝔅], where 𝔅 is
a simple Jordan region (see 4) and ξ(u) is a pair of functions defined and
continuous on the closed set 𝔅. Evidently the transformation given by [ξ(u), 𝔅⁰]
is then bounded and continuous. From work of Radó and Reichelderfer (see
II, 2, 6), it follows that the transformations [ξ(u), 𝔅] and [ξ(u), 𝔅⁰] have
the same essential multiplicity functions and the same essential maximal
model continua. Since the essential multiplicity and the essential maximal
model continua play the basic role in this paper (that is, since the transformations
behave essentially alike), there will be no confusion if one designates
either of them by T. In the sequel this is done, but it is to be understood that
whenever preceding theory is applied, T is to be interpreted as the transformation
[ξ(u), 𝔅⁰].


17. Lemma. Let T: [ξ(u), 𝔇] be a bounded continuous transformation which
is BV B, where B is an arbitrary base set (see II, 3). Given a positive number ε,
it is true that the number of mutually exclusive sets E, each of which is either an
open set or a closed set relative to 𝔇, for which the measure of T(E·B) exceeds
ε is finite.


Proof. Let E₁, ···, Eₘ be any finite system of mutually exclusive sets
having the required properties. Since T is BV B, it follows that N(ξ, T, 𝔇·B),
N(ξ, T, Eⱼ·B) are measurable and summable (see II, 4). The lemma follows
from the inequalities

\[ \int N(\xi, T, \mathfrak{D} \cdot B)\, d\xi \ge \sum_{j=1}^{m} \int N(\xi, T, E_j \cdot B)\, d\xi \ge \sum_{j=1}^{m} |T(E_j \cdot B)| > m\varepsilon. \]
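The counting argument in this proof can be spelled out as an explicit bound on the number m of sets. The display below is a reader's restatement under the hypotheses of the lemma (T of bounded variation with respect to the base set B, the sets E₁, ···, Eₘ mutually exclusive, each with image of measure greater than ε); it uses only the additivity of N in its set argument (II, 1) and the fact that N(ξ, T, Eⱼ·B) is at least 1 at every point ξ of T(Eⱼ·B).

```latex
\begin{align*}
m\,\varepsilon
  &< \sum_{j=1}^{m} |T(E_j \cdot B)|
   \le \sum_{j=1}^{m} \int N(\xi, T, E_j \cdot B)\, d\xi \\
  &= \int N\Bigl(\xi, T, \sum_{j=1}^{m} E_j \cdot B\Bigr)\, d\xi
   \le \int N(\xi, T, \mathfrak{D} \cdot B)\, d\xi ,
\qquad\text{hence}\qquad
m < \frac{1}{\varepsilon} \int N(\xi, T, \mathfrak{D} \cdot B)\, d\xi .
\end{align*}
```

Since the right-hand side is a finite number independent of the system chosen, only finitely many such sets can exist.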


From this lemma comes the 


Corollary. Let T: [ξ(u), 𝔇] be a bounded continuous transformation which
is BV B. Then the number of lines l in any family of parallel lines in the u-plane
for which T(l·B) has positive measure is at most enumerable.


18. Let T: [ξ(u), 𝔅], where 𝔅 is a simple Jordan region in the u-plane, be
a continuous transformation. Let 𝔅̄ be any other simple Jordan region in the
u-plane, and consider any topological map of 𝔅̄ onto 𝔅 given by [ū(u), 𝔅̄].
Denote by T̄ the continuous transformation [ξ̄(u), 𝔅̄] where ξ̄(u) = ξ(ū(u))
for u in 𝔅̄. Since the sets E(T, 𝔅⁰) and E(T̄, 𝔅̄⁰) are in biunique correspondence
under the map, it is true that N(ξ, T, E(T, 𝔅⁰)) = N(ξ, T̄, E(T̄, 𝔅̄⁰)).
This implies (see II, 4) the


Lemma. If T is BV E(T, 𝔅⁰), then T̄ is BV E(T̄, 𝔅̄⁰).




19. Now let T: [ξ(u), 𝔅], where 𝔅 is a simple Jordan region in the u-plane,
be a continuous transformation for which K(ξ, T, 𝔅⁰) is summable (see II, 2).
If S(𝔅) be any finite system (see 4), denote by C the set of points in 𝔅⁰ but
not in the interior of any B belonging to S(𝔅). Then (see II, 6), wherever
K(ξ, T, 𝔅⁰) is finite, hence almost everywhere,

\[ K(\xi, T, \mathfrak{B}^0) \begin{cases} = \sum K(\xi, T, B^0) \text{ for } B \text{ in } S(\mathfrak{B}) & \text{if } \xi \text{ is not in } T(C \cdot E(T, \mathfrak{B}^0)); \\ > \sum K(\xi, T, B^0) \text{ for } B \text{ in } S(\mathfrak{B}) & \text{if } \xi \text{ is in } T(C \cdot E(T, \mathfrak{B}^0)). \end{cases} \tag{1} \]

Thus

\[ \int K(\xi, T, \mathfrak{B}^0)\, d\xi \ge \sum \int K(\xi, T, B^0)\, d\xi \quad \text{for } B \text{ in } S(\mathfrak{B}), \tag{2} \]

and a necessary and sufficient condition that the sign of equality hold here
is that the sign of equality hold almost everywhere in relation 1; that is,
that the set T(C·E(T, 𝔅⁰)) be of measure zero. For brevity, any finite system
S(𝔅) for which the sign of equality holds in relation 2 is termed a maximal
system for T.


20. Lemma. Given a continuous transformation T: [ξ(u), 𝔅]. A necessary
and sufficient condition that, for any positive number δ there exist maximal
systems S(𝔅) for T such that ||S(𝔅)|| is less than δ is that T be BV E(T, 𝔅⁰).


Proof. First, assume that there exists a sequence of finite systems Sₙ(𝔅)
such that ||Sₙ(𝔅)|| converges to zero and

\[ \int K(\xi, T, \mathfrak{B}^0)\, d\xi = \sum \int K(\xi, T, B^0)\, d\xi \quad \text{for } B \text{ in } S_n(\mathfrak{B}), \qquad n = 1, 2, \cdots. \]

Denote by Cₙ the set of points in 𝔅⁰ none of which is in the interior of a region B
belonging to Sₙ(𝔅) for n = 1, 2, ···; set Γ = ΣₙCₙ. From the remark in II,
19, it follows that the set T(Γ·E(T, 𝔅⁰)) is of measure zero. On the other
hand, the set T(e(T, 𝔅⁰)) (see II, 10) is clearly a subset of T(Γ·E(T, 𝔅⁰)),
hence also of measure zero. By the lemma in II, 11, it is true that T is
BV e(T, 𝔅⁰). Since K(ξ, T, 𝔅⁰) is summable, it follows from II, 12 that T
is BV 𝔈(T, 𝔅⁰). From the lemma in II, 3, it is clear that T is BV E(T, 𝔅⁰).
This establishes the necessity of the condition. Next, assume that T is
BV E(T, 𝔅⁰); then K(ξ, T, 𝔅⁰) is summable (see II, 12). First, consider the
special case when the simple Jordan region 𝔅 is an interval ℑ (see 4). Given
a positive number δ, there exists, according to the corollary in II, 17, an
interval subdivision S(ℑ) such that ||S(ℑ)|| is less than δ and T(l·E(T, ℑ⁰))
is of measure zero for every line l forming the subdivision. Denote by C
the set of points in ℑ⁰ belonging to the lines of subdivision forming S(ℑ).
Clearly T(C·E(T, ℑ⁰)) is of measure zero, so that S(ℑ) is a maximal system
for T with ||S(ℑ)|| less than δ. Now consider the general case when
𝔅 is any simple Jordan region. Let ℑ be any interval in the u-plane; then
there exists a topological map of ℑ onto 𝔅 given by [ū(u), ℑ]. Denote by
T̄ the transformation [ξ̄(u), ℑ] where ξ̄(u) = ξ(ū(u)) for u in ℑ. Since T̄ is
BV E(T̄, ℑ⁰) (see II, 18), it follows by the result just established that there
exist interval subdivisions S(ℑ) which are maximal with respect to T̄ for
which ||S(ℑ)|| is arbitrarily small. A maximal system S(ℑ) for T̄ corresponds
under the map to a system S(𝔅) which is easily seen to be maximal for T.
In view of the uniform continuity of the topological map, ||S(𝔅)|| will be less
than δ provided ||S(ℑ)|| is chosen sufficiently small. This establishes the
lemma.
21. The methods of proof for the preceding lemma yield the 


Lemma. If Tₙ: [ξₙ(u), 𝔅], n = 1, 2, ···, be a sequence of continuous
transformations each defined on the simple Jordan region 𝔅, and each BV E(Tₙ, 𝔅⁰),
then for every positive number δ there exists a subdivision S(𝔅) having ||S(𝔅)||
less than δ, such that S(𝔅) is maximal for each of the Tₙ for n = 1, 2, ···. If
𝔅 is an interval then S(𝔅) may be chosen to be an interval subdivision.


22. Again, let T: [ξ(u), 𝔅] be a continuous transformation. For any
simple Jordan region B in 𝔅, denote by c(ξ, T, B⁰) the characteristic function
of the set of points ξ where K(ξ, T, B⁰) is positive (see II, 2). Then
c(ξ, T, B⁰) is summable, and

\[ \int c(\xi, T, B^0)\, d\xi = |T(E(T, B^0))|, \qquad B \text{ in } \mathfrak{B}. \]

Consider any sequence of finite systems Sₙ(𝔅) for which ||Sₙ(𝔅)|| converges
to zero. Clearly

\[ \sum_{B \text{ in } S_n(\mathfrak{B})} \bigl[ K(\xi, T, B^0) - c(\xi, T, B^0) \bigr] \begin{cases} \le K(\xi, T, \mathfrak{B}^0); \\ \to 0 \text{ as } n \to \infty, & \text{if } K(\xi, T, \mathfrak{B}^0) \text{ is finite.} \end{cases} \]

Thus if K(ξ, T, 𝔅⁰) is summable, it follows by a theorem of Lebesgue that
one may integrate this sequence termwise to obtain

\[ \lim_{n \to \infty} \sum_{B \text{ in } S_n(\mathfrak{B})} \int \bigl[ K(\xi, T, B^0) - c(\xi, T, B^0) \bigr]\, d\xi = 0. \]

Combining this result with that in II, 21, one concludes the


Lemma. Let T: [ξ(u), 𝔅] be a continuous transformation which is BV E(T,
𝔅⁰). Then for any sequence of maximal systems Sₙ(𝔅) for T such that ||Sₙ(𝔅)||
converges to zero, it is true that

\[ \int K(\xi, T, \mathfrak{B}^0)\, d\xi = \lim_{n \to \infty} \sum_{B \text{ in } S_n(\mathfrak{B})} \int c(\xi, T, B^0)\, d\xi. \]




CHAPTER III 
ON FLAT CONTINUOUS SURFACES 


1. Suppose that S is a flat continuous surface (see I, 3) lying in the ξ-plane.
Every representation [ξ(u), 𝔅] for S determines a bounded continuous
transformation T from the simple Jordan region 𝔅 to the ξ-plane (see II, 16). For
brevity, [ξ(u), 𝔅] is termed BV E, AC E, and so on, whenever T is BV E(T,
𝔅⁰), AC E(T, 𝔅⁰), and so on.


Lemma. The essential multiplicity K(ξ, T, 𝔅⁰) is independent of the choice
of the representation [ξ(u), 𝔅] for S.


Proof. Let [ξ₁(u), 𝔅₁], [ξ₂(u), 𝔅₂] be any two representations for S;
denote the corresponding transformations by T₁, T₂. From I, 1 it follows that
there exists a sequence of topological maps of 𝔅₁ onto 𝔅₂ given by [ūₙ(u), 𝔅₁],
n = 3, 4, ···, such that ||ξ₁(u) − ξ₂(ūₙ(u))|| < n⁻¹ for u in 𝔅₁. Let Tₙ denote
the transformation given by [ξ₂(ūₙ(u)), 𝔅₁] for n = 3, 4, ···. From the
definition of the essential multiplicity (see II, 2, 6) and the nature of a
topological map, it is clear that K(ξ, T₂, 𝔅₂⁰) = K(ξ, Tₙ, 𝔅₁⁰) for n = 3, 4, ···.
Since (see II, 1) lim ρ(Tₙ, T₁; 𝔅₁) = 0, it follows from II, 2, 2 that lim inf K(ξ,
Tₙ, 𝔅₁⁰) ≥ K(ξ, T₁, 𝔅₁⁰). Combining these relations, one obtains K(ξ, T₂, 𝔅₂⁰)
≥ K(ξ, T₁, 𝔅₁⁰). By symmetry, the opposite inequality follows; thus the
lemma is established.

2. In view of the lemma above, one may define an essential multiplicity
for a flat continuous surface S lying in the ξ-plane by the relation K(ξ, S)
= K(ξ, T, 𝔅⁰), where T is the transformation associated with any representation
[ξ(u), 𝔅] for S.


Lemma. The essential multiplicity K(ξ, S) is a lower semi-continuous functional
in each of its arguments ξ and S.


Proof. The fact that K(ξ, S) is a lower semi-continuous function of ξ follows
from II, 2, 1. Next, suppose that the flat surfaces Sₙ in the ξ-plane converge
to the surface S₀. From I, 1 it follows that there exist representations
[ξₙ(u), 𝔅] for Sₙ, for n = 0, 1, 2, ···, such that the corresponding transformations
Tₙ satisfy lim ρ(Tₙ, T₀; 𝔅) = 0. So from II, 2, 2 follows

\[ \liminf K(\xi, S_n) = \liminf K(\xi, T_n, \mathfrak{B}^0) \ge K(\xi, T_0, \mathfrak{B}^0) = K(\xi, S_0). \]


This establishes the lemma. 
3. For any flat surface S in the ξ-plane, define (see II, 2)

\[ eV(S) = \begin{cases} \int K(\xi, S)\, d\xi & \text{if } K(\xi, S) \text{ is summable}; \\ +\infty & \text{otherwise.} \end{cases} \]

The quantity eV(S) is termed the essential variation for the surface S. If
eV(S) is finite, then S is said to be a surface of bounded essential variation
(briefly, BEV). From the remarks in II, 12 follow the


1. Corollary. A sufficient condition that a flat surface be BEV is that it
possess a representation which is BV E. A necessary condition that a flat surface
be BEV is that each of its representations be BV 𝔈. Thus if one representation
for the surface is BV E, then all representations are BV 𝔈.


2. COROLLARY. The essential variation eV(S) is a lower semi-continuous 
functional of S. 


4. From the lemmas in II, 9, 12 one obtains the 


1. Lemma. If S is a flat surface which is BEV, and if [ξ(u), 𝔅] is any
representation for S, then

\[ eV(S) \ge \int_{\mathfrak{B}^0} |\mathfrak{J}(u, T)|\, du, \]

where T is the corresponding transformation. A necessary and sufficient condition
that the sign of equality hold here is that [ξ(u), 𝔅] be AC E.


2. Lemma. If S is a flat surface which is BEV, and if [ξ(u), 𝔅] is any
representation for S for which the ordinary Jacobian exists almost everywhere in
𝔅⁰, then

\[ eV(S) \ge \int_{\mathfrak{B}^0} |J(u, T)|\, du, \]

where T is the corresponding transformation. A necessary and sufficient condition
that the sign of equality hold here is that [ξ(u), 𝔅] be AC E.


5. For the purpose of comparing results with those in the literature for
surfaces given in non-parametric representation, it is necessary to recall the
concepts for bounded variation and absolute continuity used by Tonelli
(see Tonelli [1]). Let f(u) = f(u¹, u²) be a real, single-valued function defined
and continuous on the interval ℑ = [α, β] = [α¹, β¹; α², β²] (see 4). For fixed
u² in [α², β²], denote by V_{u¹}(f; u²) the total variation of f(u¹, u²) as a function
of u¹ on [α¹, β¹]. The function V_{u¹}(f; u²) is a lower semi-continuous function
of u² on [α², β²], hence is measurable. Define V_{u²}(f; u¹) by interchanging the
roles of u¹ and u². If both V_{u¹}(f; u²), V_{u²}(f; u¹) are summable on their
respective intervals of definition, then f(u) is said to be of bounded variation in the
sense of Tonelli on ℑ (briefly, BV T on ℑ), and the total u¹- and u²-variations
of f(u) on ℑ are defined by

\[ V_{u^1}(f) = \int_{\alpha^2}^{\beta^2} V_{u^1}(f; u^2)\, du^2, \qquad V_{u^2}(f) = \int_{\alpha^1}^{\beta^1} V_{u^2}(f; u^1)\, du^1. \]




If f(u) is BV T on ℑ, then it follows that the partial derivatives f_{u¹}(u), f_{u²}(u)
exist almost everywhere in ℑ, are summable on ℑ, and

\[ V_{u^1}(f) \ge \int_{\mathfrak{I}} |f_{u^1}(u)|\, du, \qquad V_{u^2}(f) \ge \int_{\mathfrak{I}} |f_{u^2}(u)|\, du. \tag{1} \]
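The total u¹- and u²-variations in the sense of Tonelli can be approximated on a grid, which gives a quick feel for the definitions. The sketch below is only an illustration under our own assumptions: the variation along each line is replaced by the sum of absolute increments over a finite set of nodes (a lower bound for the true variation, exact when the section is monotone), the outer integral is taken by the midpoint rule, and the test function f(u¹, u²) = u¹u² on the unit square is a hypothetical example. The function names are invented for the illustration.

```python
def variation_1d(values):
    """Sum of absolute increments along a finite sample of a function.

    This is a lower bound for the total variation, exact for monotone data.
    """
    return sum(abs(b - a) for a, b in zip(values, values[1:]))

def tonelli_variations(f, a1, b1, a2, b2, n=200):
    """Approximate the total u1- and u2-variations of f on [a1, b1; a2, b2].

    For each fixed value of one variable (taken at cell midpoints) the
    variation in the other variable is sampled at n + 1 nodes, and the
    results are integrated by the midpoint rule.
    """
    h1 = (b1 - a1) / n
    h2 = (b2 - a2) / n
    nodes1 = [a1 + i * h1 for i in range(n + 1)]
    nodes2 = [a2 + j * h2 for j in range(n + 1)]
    mids1 = [a1 + (i + 0.5) * h1 for i in range(n)]
    mids2 = [a2 + (j + 0.5) * h2 for j in range(n)]
    v1 = sum(variation_1d([f(x, y) for x in nodes1]) for y in mids2) * h2
    v2 = sum(variation_1d([f(x, y) for y in nodes2]) for x in mids1) * h1
    return v1, v2

# Hypothetical test function: f(u1, u2) = u1 * u2 on the unit square.
v1, v2 = tonelli_variations(lambda x, y: x * y, 0.0, 1.0, 0.0, 1.0)
```

For this test function each section is monotone, the variation in u¹ for fixed u² equals u², and both total variations equal the integrals of the absolute partial derivatives, so the sign of equality holds in relation 1.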


6. Assume that f(u) is BV T on ℑ. If moreover, for almost every u² in
[α², β²] it is true that f(u¹, u²) is an absolutely continuous function of u¹ on
[α¹, β¹], and if a similar relation holds with the roles of u¹ and u² reversed,
then f(u) is said to be absolutely continuous in the sense of Tonelli on ℑ
(briefly, AC T on ℑ). A necessary and sufficient condition that a function f(u)
which is BV T on ℑ be AC T on ℑ is that the sign of equality hold in both of
the relations III, 5, 1.

7. Let f(u) be a real, single-valued function defined and continuous on the
interval ℑ = [α, β]. Consider the continuous transformation defined by

\[ {}^1T: \quad \xi^1 = u^2, \quad \xi^2 = f(u^1, u^2), \qquad (u^1, u^2) \text{ in } \mathfrak{I}. \]

Notice that for a fixed u² = γ² in [α², β²], ¹T gives a linear transformation from
the closed linear interval u² = γ², α¹ ≤ u¹ ≤ β¹ to a bounded portion of the line
ξ¹ = γ². For this transformation it is known (see Radó [4]) that a necessary
and sufficient condition that f(u¹, γ²) be of bounded variation on [α¹, β¹] is
that N((γ², ξ²), ¹T, ℑ⁰) be summable as a function of ξ² (see 4; II, 1), and that
if f(u¹, γ²) is of bounded variation on [α¹, β¹] then (see III, 5)

\[ \int N((\gamma^2, \xi^2), {}^1T, \mathfrak{I}^0)\, d\xi^2 = V_{u^1}(f; \gamma^2). \tag{1} \]

Thus it follows that if V_{u¹}(f; u²) is summable on [α², β²], then ∫N((ξ¹, ξ²),
¹T, ℑ⁰)dξ² is a summable function of ξ¹ on [α², β²]. Now (see R² [1], chap.
III) N(ξ, ¹T, ℑ⁰) is measurable in the ξ-plane. So if V_{u¹}(f; u²) is summable on
[α², β²], it follows from the theorem of Fubini that N(ξ, ¹T, ℑ⁰) is summable,
hence (see II, 4) ¹T is BV ℑ⁰, and (see III, 5)

\[ \int N(\xi, {}^1T, \mathfrak{I}^0)\, d\xi = \int_{\alpha^2}^{\beta^2} d\xi^1 \int N((\xi^1, \xi^2), {}^1T, \mathfrak{I}^0)\, d\xi^2 = \int_{\alpha^2}^{\beta^2} V_{u^1}(f; u^2)\, du^2 = V_{u^1}(f). \tag{2} \]

Conversely, suppose that ¹T is BV ℑ⁰; then (see II, 4) it is true that N(ξ, ¹T,
ℑ⁰) is summable, so that by the theorem of Fubini, the function N((γ², ξ²),
¹T, ℑ⁰) is summable for almost every choice of ξ¹ = γ² in the interval [α², β²],
and

\[ \int N(\xi, {}^1T, \mathfrak{I}^0)\, d\xi = \int_{\alpha^2}^{\beta^2} d\xi^1 \int N((\xi^1, \xi^2), {}^1T, \mathfrak{I}^0)\, d\xi^2. \tag{3} \]



By the result cited above, it follows that $f(u^1, \gamma^2)$ is of bounded variation on $[\alpha^1, \beta^1]$ and relation 1 holds for almost every choice of $u^2 = \gamma^2$ in $[\alpha^2, \beta^2]$. In view of relation 3, it follows that $V_{u^1}(f; u^2)$ is a summable function of $u^2$ on $[\alpha^2, \beta^2]$. A similar reasoning applies to the continuous transformation defined by

${}^2T:\ \xi^1 = u^1,\ \xi^2 = f(u^1, u^2),\quad (u^1, u^2)$ in $\mathfrak{I}$.
These results are summarized in the 

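LEMMA. Let $f(u) = f(u^1, u^2)$ be a real, single-valued function defined and continuous on the interval $\mathfrak{I}$. Consider the two transformations

${}^1T:\ \xi^1 = u^2,\ \xi^2 = f(u^1, u^2)$; ${}^2T:\ \xi^1 = u^1,\ \xi^2 = f(u^1, u^2)$, $(u^1, u^2)$ in $\mathfrak{I}$.

A necessary and sufficient condition that $f(u)$ be BV T on $\mathfrak{I}$ is that both ${}^1T$ and ${}^2T$ be BV $\mathfrak{I}^0$. If $f(u)$ is BV T on $\mathfrak{I}$, then

$\int N(\xi, {}^1T, \mathfrak{I}^0)\, d\xi = V_{u^1}(f) \ge \int_{\mathfrak{I}} |f_{u^1}(u)|\, du$;

$\int N(\xi, {}^2T, \mathfrak{I}^0)\, d\xi = V_{u^2}(f) \ge \int_{\mathfrak{I}} |f_{u^2}(u)|\, du$.

A necessary and sufficient condition that the sign of equality hold in both these relations is that $f(u)$ be AC T on $\mathfrak{I}$.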
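
The quantity $V_{u^1}(f)$ in this lemma can be approximated numerically. The following is a minimal sketch, not part of the original argument: it assumes a uniform partition in $u^1$ and the midpoint rule in $u^2$, and the name `tonelli_variation_u1` and the sample function are illustrative choices. For a general $f$ the partition sums only give a lower bound for the variation of each section.

```python
def tonelli_variation_u1(f, a1, b1, a2, b2, m=200, k=200):
    """Approximate V_{u^1}(f): the integral over u^2 of the total
    variation of u^1 -> f(u^1, u^2) on [a1, b1]."""
    h2 = (b2 - a2) / k
    total = 0.0
    for r in range(k):
        u2 = a2 + (r + 0.5) * h2              # midpoint rule in u^2
        var = 0.0                             # partition sum for the section
        prev = f(a1, u2)
        for j in range(1, m + 1):
            cur = f(a1 + (b1 - a1) * j / m, u2)
            var += abs(cur - prev)
            prev = cur
        total += var * h2
    return total


def sample(u1, u2):
    # Illustrative function: every section is monotone in u^1.
    return u1 * u2


approx = tonelli_variation_u1(sample, 0.0, 1.0, 0.0, 1.0)
print(approx)
```

For this sample function each section is monotone, so the partition sums are exact and the result agrees, up to rounding, with $\int_{\mathfrak{I}} |f_{u^1}(u)|\, du = 1/2$; this is the case of equality, as expected for a function which is AC T.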


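8. Again let $f(u)$ be a real, single-valued function defined and continuous on the interval $\mathfrak{I} = [\alpha, \beta]$. Retain the notation of the preceding section. For an interval $I = [\gamma^1, \delta^1; \gamma^2, \delta^2]$ contained in $\mathfrak{I}$ define

${}^1\psi(I) = \int_{\gamma^2}^{\delta^2} |f(\delta^1, u^2) - f(\gamma^1, u^2)|\, du^2$.

It will now be shown that for any interval system $S(\mathfrak{I})$ (see 4),

1. ${}^1\psi(S(\mathfrak{I})) \le \int K(\xi, {}^1T, \mathfrak{I}^0)\, d\xi$,

provided that $K(\xi, {}^1T, \mathfrak{I}^0)$ is summable. Evidently ${}^1\psi(I)$ is the measure of the set of points $\xi = (\xi^1, \xi^2)$ which satisfy the inequalities

$(f(\delta^1, \xi^1) - \xi^2)(f(\gamma^1, \xi^1) - \xi^2) < 0, \quad \gamma^2 < \xi^1 < \delta^2$.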


Consider any point $\xi_0 = (\xi_0^1, \xi_0^2)$ which satisfies these inequalities. Its inverse ${}^1T^{-1}(\xi_0)$ evidently is contained on the line $u^2 = \xi_0^1$. Since $f(u^1, \xi_0^1) - \xi_0^2$ is continuous on $[\gamma^1, \delta^1]$ and has opposite signs at the points $u^1 = \gamma^1$, $u^1 = \delta^1$, it follows that $\xi_0$ has a model in the interior of $I$, and that $\mu(\xi_0, {}^1T, I) = \pm 1$. From II, 6


272 P. V. REICHELDERFER (March 


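it follows that $K(\xi_0, {}^1T, I^0) \ge 1$. Thus ${}^1\psi(I) \le \int K(\xi, {}^1T, I^0)\, d\xi$. From II, 19, 2 it follows that

${}^1\psi(S(\mathfrak{I})) \le \sum_{I \text{ in } S(\mathfrak{I})} \int K(\xi, {}^1T, I^0)\, d\xi \le \int K(\xi, {}^1T, \mathfrak{I}^0)\, d\xi$.

Thus relation 1 is established. Now consider the sequence of subdivisions $S_m(\mathfrak{I})$ of $\mathfrak{I}$ into intervals $[\gamma^1_{j-1}, \gamma^1_j; \alpha^2, \beta^2]$, where $\gamma^1_j = \alpha^1 + (\beta^1 - \alpha^1) j/m$, $j = 0, \cdots, m$. Evidently

${}^1\psi(S_m(\mathfrak{I})) = \int_{\alpha^2}^{\beta^2} \sum_{j=1}^{m} |f(\gamma^1_j, u^2) - f(\gamma^1_{j-1}, u^2)|\, du^2$

and (see III, 5)

$\lim \sum_{j=1}^{m} |f(\gamma^1_j, u^2) - f(\gamma^1_{j-1}, u^2)| = V_{u^1}(f; u^2)$.

From the lemma of Fatou it follows that $V_{u^1}(f; u^2)$ is summable on $[\alpha^2, \beta^2]$ whenever $K(\xi, {}^1T, \mathfrak{I}^0)$ is summable, and

$V_{u^1}(f) \le \int K(\xi, {}^1T, \mathfrak{I}^0)\, d\xi$.

Since $K(\xi, {}^1T, \mathfrak{I}^0) \le N(\xi, {}^1T, \mathfrak{I}^0)$, it is clear from III, 7 that a necessary and sufficient condition that $V_{u^1}(f; u^2)$ be summable on $[\alpha^2, \beta^2]$ is that $K(\xi, {}^1T, \mathfrak{I}^0)$ be summable. If $K(\xi, {}^1T, \mathfrak{I}^0)$ is summable, then

$V_{u^1}(f) = \int K(\xi, {}^1T, \mathfrak{I}^0)\, d\xi \ge \int_{\mathfrak{I}} |f_{u^1}(u)|\, du$.

Similar statements are valid for the transformation ${}^2T$. Combining these facts with those in III, 3, 4, 7, one obtains the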

LEMMA. Let $f(u) = f(u^1, u^2)$ be a real, single-valued function defined and continuous on the interval $\mathfrak{I}$. Consider the flat surfaces having the representations

${}^1S: [(u^2, f(u^1, u^2)), \mathfrak{I}]$; ${}^2S: [(u^1, f(u^1, u^2)), \mathfrak{I}]$.

A necessary and sufficient condition that $f(u)$ be BV T on $\mathfrak{I}$ is that both ${}^1S$ and ${}^2S$ be BEV. If $f(u)$ is BV T on $\mathfrak{I}$, then

$eV({}^1S) = V_{u^1}(f) \ge \int_{\mathfrak{I}} |f_{u^1}(u)|\, du$; $eV({}^2S) = V_{u^2}(f) \ge \int_{\mathfrak{I}} |f_{u^2}(u)|\, du$.

A necessary and sufficient condition that the sign of equality hold in both these relations is that $f(u)$ be AC T on $\mathfrak{I}$. A necessary and sufficient condition that $f(u)$ be AC T on $\mathfrak{I}$ is that both of the representations for ${}^1S$ and ${}^2S$ be AC E.




CHAPTER IV 


ON THE ESSENTIAL AREA 


1. Let $S$ be a continuous surface in $x$-space (see I, 1), and let $[x(u), \mathfrak{B}]$ be any representation for $S$. With $S$ there are associated the three projection surfaces ${}^iS$ upon the coordinate planes ${}^ix$ having representations $[{}^ix(u), \mathfrak{B}]$, which determine bounded continuous transformations ${}^iT$ for $i = 1, 2, 3$ (see 4; I, 4). In this chapter, the theory of the preceding chapters is used when the plane $\xi$ coincides in turn with the coordinate planes ${}^ix$, $i = 1, 2, 3$. The following triple notation is useful (see III, 2)

$K(x, S) = (K({}^1x, {}^1S), K({}^2x, {}^2S), K({}^3x, {}^3S))$;
$F(u, [x, \mathfrak{B}]) = (F(u, {}^1T), F(u, {}^2T), F(u, {}^3T))$ for $u$ in $\mathfrak{B}^0$;
$J(u, [x, \mathfrak{B}]) = (J(u, {}^1T), J(u, {}^2T), J(u, {}^3T))$ for $u$ in $\mathfrak{B}^0$.

The triple $[x(u), \mathfrak{B}]$ is said to be BV E, AC E, and so on, when each of the associated transformations ${}^iT$ is BV E(${}^iT$, $\mathfrak{B}^0$), AC E(${}^iT$, $\mathfrak{B}^0$), and so on, for $i = 1, 2, 3$ (cf. III, 1).

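2. Given a continuous surface $S$ for which $K(x, S)$ is summable, that is, for which each of the projection surfaces ${}^1S$, ${}^2S$, ${}^3S$ is BEV (see III, 3). If $[x(u), \mathfrak{B}]$ be any representation for $S$, define, for any simple Jordan region $B$ in $\mathfrak{B}$,

${}^i\phi(B) = \int K({}^ix, {}^iT, B^0)\, d\,{}^ix, \quad i = 1, 2, 3$;

$\phi(B) = ({}^1\phi(B), {}^2\phi(B), {}^3\phi(B)), \quad \Phi(B) = \|\phi(B)\|$ for $B$ in $\mathfrak{B}$.

It follows from II, 19 that, for any finite system $S(\mathfrak{B})$ of nonoverlapping simple Jordan regions in $\mathfrak{B}$ (see 4), it is true that ${}^i\phi(S(\mathfrak{B})) \le {}^i\phi(\mathfrak{B})$ for $i = 1, 2, 3$. Thus one has (see III, 1-3)

1. $eV({}^iS) = {}^i\phi(\mathfrak{B}) = U(\mathfrak{B}; [{}^i\phi, \mathfrak{B}])$ for $i = 1, 2, 3$.

From a remark in 5 follows the fact that $U(\mathfrak{B}; [\Phi, \mathfrak{B}])$ is finite; moreover, one has the

LEMMA. The quantity $U(\mathfrak{B}; [\Phi, \mathfrak{B}])$ is independent of the choice of the representation for $S$.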


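Proof. Let $[x_1(u), \mathfrak{B}_1]$, $[x_2(u), \mathfrak{B}_2]$ be any two representations for $S$. From I, 1 it follows that there exists a sequence of topological maps of $\mathfrak{B}_1$ onto $\mathfrak{B}_2$ given by $[\bar{u}_n(u), \mathfrak{B}_1]$ for $n = 3, 4, \cdots$ such that $\|x_1(u) - x_2(\bar{u}_n(u))\| < 1/n$ for $u$ in $\mathfrak{B}_1$. Denote by ${}^iT_n$ the transformations given by the triples $[{}^ix_2(\bar{u}_n(u)), \mathfrak{B}_1]$ for $n = 3, 4, \cdots$; $i = 1, 2, 3$. Put

${}^i\phi_n(B) = \int K({}^ix, {}^iT_n, B^0)\, d\,{}^ix, \quad i = 1, 2, 3;\ n = 1, 2, 3, \cdots$;

$\phi_n(B) = ({}^1\phi_n(B), {}^2\phi_n(B), {}^3\phi_n(B)), \quad \Phi_n(B) = \|\phi_n(B)\|$ for $B$ in the region of definition of ${}^iT_n$.

To any finite system $S(\mathfrak{B}_1)$ (see 4) there corresponds under the map $[\bar{u}_n(u), \mathfrak{B}_1]$ a finite system $S_n(\mathfrak{B}_2)$. Clearly (see III, 1)

$U(\mathfrak{B}_2; [\Phi_2, \mathfrak{B}_2]) \ge \Phi_2(S_n(\mathfrak{B}_2)) = \Phi_n(S(\mathfrak{B}_1))$ for $n = 3, 4, \cdots$.

Since $\lim \rho({}^iT_n, {}^iT_1; \mathfrak{B}_1) = 0$ for $i = 1, 2, 3$, it follows from II, 2, 2 and the lemma of Fatou that $\liminf \Phi_n(S(\mathfrak{B}_1)) \ge \Phi_1(S(\mathfrak{B}_1))$. From the preceding relations and the definition of the U-function (see 4), it follows that $U(\mathfrak{B}_2; [\Phi_2, \mathfrak{B}_2]) \ge U(\mathfrak{B}_1; [\Phi_1, \mathfrak{B}_1])$. The opposite inequality follows by symmetry, and the lemma is established.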

3. Let $S$ be a continuous surface in $x$-space. In view of the preceding lemma, one may define the essential area $eA(S)$ for the surface as follows. If $[x(u), \mathfrak{B}]$ be any representation for $S$, set

$eA(S) = \begin{cases} U(\mathfrak{B}; [\Phi, \mathfrak{B}]) & \text{if } K(x, S) \text{ is summable;} \\ +\infty & \text{otherwise.} \end{cases}$

Clearly $eA(S)$ is independent of the choice of the representation for $S$, although it is not clear whether it is also independent of the choice of a coordinate system in $x$-space. If $S$ is a flat surface in a coordinate plane (see I, 3), then $eA(S) = eV(S)$ (see III, 2; IV, 2, 1)(®).

4. From 5; IV, 2, 1 follows the


THEOREM. A necessary and sufficient condition that the essential area $eA(S)$ of a surface $S$ be finite is that each of the projection surfaces ${}^1S$, ${}^2S$, ${}^3S$ be of bounded essential variation. Between the essential area and the essential variations of the projection surfaces, the following relation exists

$eV({}^jS) \le eA(S) \le \sum_{i=1}^{3} eV({}^iS)$ for $j = 1, 2, 3$.

Notice that this theorem is an analogue for continuous surfaces of the result for continuous curves cited in 1, 1.
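
The relation in this theorem reflects an elementary property of the norm of a triple with non-negative components. The following display is a sketch only, written in the notation of IV, 2 and under the assumption that $\|\cdot\|$ denotes the Euclidean norm (as the exponent 1/2 appearing in IV, 19 suggests); it is not quoted from the sections cited above.

```latex
\[
  {}^{j}\phi(B) \;\le\; \Phi(B)
  = \Bigl( \sum_{i=1}^{3} \bigl( {}^{i}\phi(B) \bigr)^{2} \Bigr)^{1/2}
  \;\le\; \sum_{i=1}^{3} {}^{i}\phi(B),
  \qquad j = 1, 2, 3 .
\]
```

Summing over the regions $B$ of a finite system $S(\mathfrak{B})$ and passing to the least upper bound, one would obtain $U(\mathfrak{B}; [{}^j\phi, \mathfrak{B}]) \le U(\mathfrak{B}; [\Phi, \mathfrak{B}]) \le \sum_i U(\mathfrak{B}; [{}^i\phi, \mathfrak{B}])$, which by relation IV, 2, 1 is the stated inequality whenever $K(x, S)$ is summable.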


5. LEMMA. The essential area $eA(S)$ is a lower semi-continuous functional of $S$.


Proof. Suppose the sequence of continuous surfaces $S_n$ converges to a surface $S_0$; from I, 1 it follows that there exist representations $[x_n(u), \mathfrak{B}]$ for $n = 0, 1, 2, \cdots$ such that $x_n(u)$ converges on $\mathfrak{B}$ uniformly to $x_0(u)$. Adopt the notation of IV, 2, using a subscript $n$ to distinguish the functions belonging to


(®) Since $eV({}^iS) = eA({}^iS)$ for $i = 1, 2, 3$, the notion of essential variation might very well be discarded. However, it has not been the custom to speak of the length of a one-dimensional curve, but rather to speak of the total variation of a function representing that curve. To preserve this parallel between the theory of curves and the theory of surfaces, the concept of essential variation has been introduced here.




$S_n$. From II, 2, 2 and the lemma of Fatou, it follows that $\liminf {}^i\phi_n(B) \ge {}^i\phi_0(B)$ for $B$ in $\mathfrak{B}$, $i = 1, 2, 3$. From the lemma in 5 and the definition of the essential area follows $\liminf eA(S_n) \ge eA(S_0)$, and the lemma is proved.
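
That only lower semi-continuity, and not continuity, can be expected is already visible in the one-dimensional analogue for the length of curves. The following sketch is an illustration chosen here, not an example from the text; the helper names `zigzag` and `length` are hypothetical. It builds sawtooth polygons which converge uniformly to the unit segment while their lengths remain equal to $\sqrt{2}$.

```python
import math


def zigzag(n):
    """Vertices of a sawtooth polygon over [0, 1] with n teeth of height 1/(2n)."""
    pts = []
    for k in range(2 * n + 1):
        x = k / (2 * n)
        y = 1 / (2 * n) if k % 2 == 1 else 0.0
        pts.append((x, y))
    return pts


def length(pts):
    """Length of the polygon through the given vertices."""
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))


for n in (1, 10, 100):
    pts = zigzag(n)
    sup_dist = max(y for _, y in pts)   # uniform distance to the segment y = 0
    print(n, sup_dist, length(pts))
```

The uniform distance tends to zero, yet every polygon has length $\sqrt{2}$ up to rounding, which exceeds the length 1 of the limit segment; the inequality $\liminf \ge$ is strict here.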


6. THEOREM. If the essential area for a surface $S$ is finite, and if $[x(u), \mathfrak{B}]$ be any representation for $S$, then the triple $F(u, [x, \mathfrak{B}])$ of generalized Jacobians exists almost everywhere in $\mathfrak{B}^0$, is summable on $\mathfrak{B}^0$, and

1. $eA(S) \ge \int_{\mathfrak{B}^0} \|F(u, [x, \mathfrak{B}])\|\, du$.

A sufficient condition that the sign of equality hold here is that the triple $[x(u), \mathfrak{B}]$ be AC E; a necessary condition that the sign of equality hold is that the triple be AC $\mathfrak{E}$.


Proof. Since $eA(S)$ is finite, it is true that each ${}^iS$ is BEV (see IV, 4), hence $K({}^ix, {}^iT, \mathfrak{B}^0)$ is summable for $i = 1, 2, 3$ (see III, 2-3). Thus (see IV, 2; II, 9, 1) each $F(u, {}^iT)$ exists almost everywhere in $\mathfrak{B}^0$, is summable on $\mathfrak{B}^0$, and

2. ${}^i\phi(B) \ge \int_{B^0} |F(u, {}^iT)|\, du$ for $B$ in $\mathfrak{B}$, $i = 1, 2, 3$.

Define

${}^i\psi(B) = \int_{B^0} |F(u, {}^iT)|\, du$; $\psi(B) = ({}^1\psi(B), {}^2\psi(B), {}^3\psi(B)), \quad \Psi(B) = \|\psi(B)\|$ for $B$ in $\mathfrak{B}$.


If S(%) be any finite system, it follows by a known inequality (see Hardy, 
Littlewood, Polya [1, chap. VI]) that 


On the other hand, let $R$ be a simple Jordan region in $\mathfrak{B}$ which admits of an interval subdivision. Then it is true (see R² [2, chap. II, §10]) that there exists a sequence of interval subdivisions $S_n(R)$ such that


lim ¥(S4(R)) = 


U(®; [¥, B) = 


Thus 
Thus 




Since the open domain $\mathfrak{B}^0$ may be filled up from the interior (see 4) by a sequence of simple regions of type $R$ (see Kerékjártó [1]), one concludes from relations 4 and 5 that


Relations 2 and 3 imply 

7. U(B; B]) = [¥, 

In view of IV, 3, the first statement in the theorem is established. Now (see II, 9, 12) if $[x(u), \mathfrak{B}]$ is AC E, then the sign of equality holds in 2, hence in 7. Thus a sufficient condition that the sign of equality hold in 1 is that $[x(u), \mathfrak{B}]$ be AC E. Next, suppose the sign of equality holds in relation 1. Let $I$ be any interval in $\mathfrak{B}$; extend the lines forming the boundary of $I$ indefinitely. The simple Jordan region $\mathfrak{B}$ is thus divided into a possibly enumerable number of simple Jordan regions $I = B_0, B_1, B_2, \cdots$. For any simple Jordan region $B$ in $\mathfrak{B}$, the reasoning leading to relation 6 gives


Since a bounded portion of a straight line is rectifiable, it follows that

$U(\mathfrak{B}; [\Psi, \mathfrak{B}]) = \sum_{j} U(B_j; [\Psi, \mathfrak{B}])$.

From this relation and relations 2, 3, one finds 


But since the sign of equality holds in 1 by hypothesis, it follows that the sign 
of equality must hold at every step in the preceding inequalities. In particular, 
then 


[8,8] = UCT; [¥, 8) = f 


If s be any closed oriented square in %, then obviously (see II, 3-8; IV, 1-3) 


Gls, ‘T, ECT, < f K(‘x, ‘7, s)déx < U(s; B]) 


= f llr, 


Thus each ${}^iT$ is AC E(${}^iT$, $\mathfrak{B}^0$) for $i = 1, 2, 3$; that is, $[x(u), \mathfrak{B}]$ is AC E. This completes the proof.




7. COROLLARY. If a surface $S$ has a representation $[x(u), \mathfrak{B}]$ which is BV E, then the essential area $eA(S)$ is finite, the triple $F(u, [x, \mathfrak{B}])$ of generalized Jacobians exists almost everywhere in $\mathfrak{B}^0$, is summable on $\mathfrak{B}^0$, and

$eA(S) \ge \int_{\mathfrak{B}^0} \|F(u, [x, \mathfrak{B}])\|\, du$.

A necessary and sufficient condition that the sign of equality hold here is that $[x(u), \mathfrak{B}]$ be AC E.


A proof follows from results in II, 11; III, 3, 1; IV, 4, 6. This is an analogue to the theorems in 1, 2, 3 for continuous curves; all representations for continuous curves automatically satisfy the analogue of BV E in one dimension whenever the length of the curves is finite.


8. COROLLARY. Let $S_n$, $n = 0, 1, 2, \cdots$, be a sequence of continuous surfaces satisfying the following conditions: the surfaces $S_n$ converge to $S_0$ (see I, 1); each of the surfaces has a finite essential area $eA(S_n)$ for $n = 0, 1, 2, \cdots$ (see IV, 3, 4); the surface $S_0$ has a representation $[x(u), \mathfrak{B}]$ for which the essential areas $eA(S_n)$ converge to $\int_{\mathfrak{B}^0} \|F(u, [x, \mathfrak{B}])\|\, du$. Then the representation $[x(u), \mathfrak{B}]$ is AC $\mathfrak{E}$.


Proof. From IV, 5, 6 and the assumptions, one obtains

$\int_{\mathfrak{B}^0} \|F(u, [x, \mathfrak{B}])\|\, du = \lim eA(S_n) \ge eA(S_0) \ge \int_{\mathfrak{B}^0} \|F(u, [x, \mathfrak{B}])\|\, du$.

Thus the sign of equality holds throughout, and the conclusion now follows from the last part of the theorem in IV, 6.

9. By the principle stated in I, 8, the theorem in IV, 6 may be given the 
following variant form. 


THEOREM. A necessary condition that a representation $[x(u), \mathfrak{B}]$ for a surface $S$ be absolutely continuous $(eA, F)$, where $eA(S)$ is the essential area of $S$ and $F(u, [x, \mathfrak{B}])$ is the triple of generalized Jacobians, is that $[x(u), \mathfrak{B}]$ be AC E. A sufficient condition that this representation for $S$ be absolutely continuous $(eA, F)$ is that it be AC E; then each of the corresponding representations for the projection surfaces ${}^iS$ is also absolutely continuous $(eA, F)$ for $i = 1, 2, 3$.


10. The results in IV, 6-9 are paralleled by similar theorems involving the essential area and the triple of ordinary Jacobians; one need but replace II, 9, 1 by II, 9, 2 in making the proofs.


THEOREM. If the essential area for a surface $S$ is finite, and if $[x(u), \mathfrak{B}]$ be any representation for $S$ for which the triple $J(u, [x, \mathfrak{B}])$ of ordinary Jacobians exists almost everywhere in $\mathfrak{B}^0$, then the triple is summable on $\mathfrak{B}^0$, and

$eA(S) \ge \int_{\mathfrak{B}^0} \|J(u, [x, \mathfrak{B}])\|\, du$.

A sufficient condition that the sign of equality hold here is that the triple $[x(u), \mathfrak{B}]$ be AC E; a necessary condition that the sign of equality hold is that the triple be AC $\mathfrak{E}$.

11. COROLLARY. If a surface $S$ has a representation $[x(u), \mathfrak{B}]$ which is BV E and for which the triple $J(u, [x, \mathfrak{B}])$ of ordinary Jacobians exists almost everywhere in $\mathfrak{B}^0$, then the essential area is finite, the triple $J(u, [x, \mathfrak{B}])$ is summable on $\mathfrak{B}^0$, and

$eA(S) \ge \int_{\mathfrak{B}^0} \|J(u, [x, \mathfrak{B}])\|\, du$.

A necessary and sufficient condition that the sign of equality hold here is that $[x(u), \mathfrak{B}]$ be AC E.


12. COROLLARY. Let $S_n$, $n = 0, 1, 2, \cdots$, be a sequence of continuous surfaces satisfying the following conditions: the surfaces $S_n$ converge to $S_0$; each of the surfaces has a finite essential area $eA(S_n)$ for $n = 0, 1, 2, \cdots$; the surface $S_0$ has a representation $[x(u), \mathfrak{B}]$ for which the triple $J(u, [x, \mathfrak{B}])$ of ordinary Jacobians exists almost everywhere in $\mathfrak{B}^0$ and the essential areas $eA(S_n)$ converge to $\int_{\mathfrak{B}^0} \|J(u, [x, \mathfrak{B}])\|\, du$. Then the representation $[x(u), \mathfrak{B}]$ is AC E.


13. Here is a variant form for the theorem in IV, 10. 


THEOREM. A necessary condition that a representation $[x(u), \mathfrak{B}]$ for a surface $S$, for which the triple $J(u, [x, \mathfrak{B}])$ of ordinary Jacobians exists almost everywhere in $\mathfrak{B}^0$, be absolutely continuous $(eA, J)$ is that $[x(u), \mathfrak{B}]$ be AC E. A sufficient condition that this representation for $S$ be absolutely continuous $(eA, J)$ is that it be AC E; then each of the corresponding representations for the projection surfaces ${}^iS$ is also absolutely continuous $(eA, J)$ for $i = 1, 2, 3$.


14. Rado (see Rado [2]) has shown that a representation $[x(u), \mathfrak{B}]$ for a surface $S$ which satisfies a Lipschitz condition of the form $\|x(u_1) - x(u_2)\| \le L \|u_1 - u_2\|$, where $L$ is a constant, is absolutely continuous $(A, J)$, where $A(S)$ is the Lebesgue area of $S$ (see I, 6) and $J(u, [x, \mathfrak{B}])$ is the triple of ordinary Jacobians. Now the representation $[x(u), \mathfrak{B}]$ is also AC E (see II, 8). From the theorem in IV, 13 one concludes that the essential area and the Lebesgue area of $S$ are equal. In particular, $eA(P) = A(P)$ for every polyhedron $P$ (see I, 5).
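
For a single triangular face the role of the triple of Jacobians is easy to make concrete. The sketch below is an illustration under assumptions, not the author's construction: the helper name `area_triple` is hypothetical, and the coordinate planes are taken in the cyclic order $(x^2, x^3)$, $(x^3, x^1)$, $(x^1, x^2)$, which may differ from the paper's ordering by the sign of a component only.

```python
import math


def area_triple(p, q, r):
    """Signed areas of the projections of the triangle pqr on the planes
    (x2, x3), (x3, x1), (x1, x2): half the cross product (q - p) x (r - p)."""
    a = [q[i] - p[i] for i in range(3)]
    b = [r[i] - p[i] for i in range(3)]
    return (0.5 * (a[1] * b[2] - a[2] * b[1]),
            0.5 * (a[2] * b[0] - a[0] * b[2]),
            0.5 * (a[0] * b[1] - a[1] * b[0]))


triple = area_triple((0, 0, 0), (1, 0, 0), (0, 1, 1))
area = math.sqrt(sum(c * c for c in triple))      # norm of the triple
projections = [abs(c) for c in triple]            # areas of the flat projections
print(triple, area, projections)
```

Here the area of the face is the norm of the triple, and it lies between the largest projected area and the sum of the three projected areas, in agreement with the relation in IV, 4.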

Given any continuous surface $S$, there exists a sequence of polyhedra $P_n$ such that $P_n$ converge to $S$ and the Lebesgue areas $A(P_n)$ converge to $A(S)$ (see I, 6, 2). Since $eA(P_n) = A(P_n)$, one finds by using the lemma in IV, 5 the


THEOREM. The essential area of a surface $S$ does not exceed the Lebesgue area; that is, $eA(S) \le A(S)$. A necessary and sufficient condition that the essential area and the Lebesgue area of a surface $S$ be equal is that there exists a sequence of polyhedra $P_n$ such that $P_n$ converges to $S$ and $eA(P_n)$ converges to $eA(S)$.




This theorem and the theorem in IV, 10 have the


COROLLARY. A sufficient condition that the Lebesgue area and the essential area of a surface $S$ be equal is that $S$ possess a representation $[x(u), \mathfrak{B}]$ which is absolutely continuous $(A, J)$, where $A(S)$ is the Lebesgue area of $S$ and $J(u, [x, \mathfrak{B}])$ is the triple of ordinary Jacobians.


15. In order to compare these results with those of Geöcze and Tonelli (cf. 2), and to give a proof for the statement made in I, 7, the following result is needed.


LEMMA. If a surface $S$ has a representation $[x(u), \mathfrak{I}]$ of non-parametric origin

$S:\ x(u) = (u^1, u^2, x^3(u^1, u^2)), \quad u = (u^1, u^2)$ in $\mathfrak{I}$,
then eA(S) =As(S) =A(S). 
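Proof. Define, for any interval $I = [\gamma^1, \delta^1; \gamma^2, \delta^2]$ in $\mathfrak{I}$(*),

${}^1\psi(I) = \int_{\gamma^2}^{\delta^2} |x^3(\gamma^1, u^2) - x^3(\delta^1, u^2)|\, du^2$,

${}^2\psi(I) = \int_{\gamma^1}^{\delta^1} |x^3(u^1, \gamma^2) - x^3(u^1, \delta^2)|\, du^1, \quad {}^3\psi(I) = |I|, \quad \psi(I) = ({}^1\psi(I), {}^2\psi(I), {}^3\psi(I)), \quad \Psi(I) = \|\psi(I)\|$.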


If ${}^iT$, $i = 1, 2, 3$, be the associated continuous transformations (see IV, 1), then the reader will observe that the transformations ${}^1T$ and ${}^2T$ bear the same relation to the function $x^3(u)$ as those in III, 7, 8 bear to the function $f(u)$ therein considered; the transformation ${}^3T$ is simply the identity mapping of $\mathfrak{I}$ in the $u$-plane onto a congruent interval in the ${}^3x$-plane. Define a triple of $\phi$-functions $[\phi, \mathfrak{I}]$ as in IV, 2. The reasoning in III, 8 shows that $\psi(I) \le \phi(I)$ for $I$ in $\mathfrak{I}$. Now Rado has shown that (see Rado [1])

$A_s(S) = \text{l.u.b. } \Psi(S(\mathfrak{I}))$ for interval subdivisions $S(\mathfrak{I})$.

From these relations it is clear that

$A_s(S) \le U(\mathfrak{I}; [\Phi, \mathfrak{I}]) = eA(S)$.


In view of the relations established in I, 7 and IV, 14, the lemma is now 
proved. 

The reader will notice that attention is restricted here to surfaces having 
a representation of non-parametric origin where the parameter range is an 
interval. Further considerations would establish this result more generally, 
but these are not necessary since it has been customary in the literature on 
such surfaces to so restrict the range of the independent variables. 

16. The lemmas in III, 7, 8; IV, 4, 10, 11, 15 imply all the results of 


(*) These are the expressions of Geöcze (see Rado [1]).




Geöcze and Tonelli for the non-parametric case which are analogous to those for curves stated in 1. For if $S$ be a surface having a representation $[x(u), \mathfrak{I}]$ of non-parametric origin

1. $x(u) = (u^1, u^2, x^3(u^1, u^2)), \quad u = (u^1, u^2)$ in the interval $\mathfrak{I}$,

then $eA(S) = A(S)$ by the lemma in IV, 15. Thus from IV, 4 it follows that a necessary and sufficient condition that the Lebesgue area of $S$ be finite is that each of the projection surfaces ${}^1S$, ${}^2S$, ${}^3S$ be BEV. The projection surface ${}^3S$ is obviously BEV under any conditions, while it follows from the lemma in III, 8 that a necessary and sufficient condition that ${}^1S$ and ${}^2S$ be BEV is that $x^3(u)$ be BV T on $\mathfrak{I}$. Summarizing these facts, one obtains a known theorem in the non-parametric case (see Tonelli [1]): a necessary and sufficient condition that the Lebesgue area $A(S)$ of a surface $S$ having a representation 1 of non-parametric origin be finite is that $x^3(u)$ be BV T on $\mathfrak{I}$.

If $x^3(u)$ is BV T on $\mathfrak{I}$, it follows that the triple $J(u, [x, \mathfrak{I}])$ of ordinary Jacobians exists almost everywhere on $\mathfrak{I}$; in fact,

$J(u, [x, \mathfrak{I}]) = (-x^3_{u^1}(u),\ x^3_{u^2}(u),\ 1)$ almost everywhere on $\mathfrak{I}$.

Hence from IV, 10 one concludes that this triple is summable on $\mathfrak{I}$ and

$A(S) \ge \int_{\mathfrak{I}} \bigl( (x^3_{u^1}(u))^2 + (x^3_{u^2}(u))^2 + 1 \bigr)^{1/2}\, du$.

In view of the lemma in III, 7, it is clear that the representation $[x(u), \mathfrak{I}]$ is BV $\mathfrak{I}^0$, hence BV E, when $x^3(u)$ is BV T on $\mathfrak{I}$; from IV, 11 it follows that a necessary and sufficient condition that the sign of equality hold above is that $[x(u), \mathfrak{I}]$ be AC E. The representation $[{}^3x(u), \mathfrak{I}]$ for ${}^3S$ is clearly AC E under any conditions, while it follows from the lemma in III, 8 that a necessary and sufficient condition that the representations $[{}^1x(u), \mathfrak{I}]$, $[{}^2x(u), \mathfrak{I}]$ be AC E is that $x^3(u)$ be AC T on $\mathfrak{I}$. Thus other known theorems for the non-parametric case are obtained (see Tonelli [1]): if the Lebesgue area $A(S)$ is finite, then each of the partial derivatives

$x^3_{u^1}(u), \quad x^3_{u^2}(u)$

exists almost everywhere in $\mathfrak{I}$, is summable on $\mathfrak{I}$ and

$A(S) \ge \int_{\mathfrak{I}} \bigl( (x^3_{u^1}(u))^2 + (x^3_{u^2}(u))^2 + 1 \bigr)^{1/2}\, du$;

a necessary and sufficient condition that the sign of equality hold here is that $x^3(u)$ be AC T on $\mathfrak{I}$.
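
The integral on the right of the last relation is readily evaluated numerically for a smooth function, for which the sign of equality holds. The following is a rough sketch under stated assumptions: central differences replace the partial derivatives, the midpoint rule replaces the integral, and the name `graph_area` and the sample plane are illustrative choices.

```python
import math


def graph_area(f, a1, b1, a2, b2, n=100, h=1e-5):
    """Midpoint-rule approximation of the integral of
    sqrt(f_{u1}^2 + f_{u2}^2 + 1) over the interval [a1, b1] x [a2, b2]."""
    d1, d2 = (b1 - a1) / n, (b2 - a2) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            u1 = a1 + (i + 0.5) * d1
            u2 = a2 + (j + 0.5) * d2
            # central differences stand in for the partial derivatives
            f1 = (f(u1 + h, u2) - f(u1 - h, u2)) / (2 * h)
            f2 = (f(u1, u2 + h) - f(u1, u2 - h)) / (2 * h)
            total += math.sqrt(f1 * f1 + f2 * f2 + 1.0) * d1 * d2
    return total


def plane(u1, u2):
    # Illustrative surface x^3 = 2 u^1 + 2 u^2 over the unit square.
    return 2.0 * u1 + 2.0 * u2


print(graph_area(plane, 0.0, 1.0, 0.0, 1.0))
```

For the plane chosen here the integrand is constant and equal to $(4 + 4 + 1)^{1/2} = 3$, so the computed value is close to 3, the elementary area of the parallelogram lying over the unit square.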

17. The next sections are devoted to a comparison of the essential area defined in IV, 3 with the Geöcze area as defined by Rado (see Rado [2, part II, §1]). Given a continuous surface $S$, let $[x(u), \mathfrak{B}]$ be any representation for it. Define (see II, 2, 22), for any simple Jordan region $B$ in $\mathfrak{B}$,


= 


= | ‘7, B)| =| B)| = f c(ix, ‘T, i= 1, 2, 3; 


$\psi(B) = ({}^1\psi(B), {}^2\psi(B), {}^3\psi(B)), \quad \Psi(B) = \|\psi(B)\|$.


The quantity $U(\mathfrak{B}; [\Psi, \mathfrak{B}])$ is independent of the choice of a representation for $S$ (see Rado [2]). It is this quantity which Rado terms the Geöcze area of
S—denote it by G(S). Since &(1, ‘7, B) is the set of points where K(‘x, ‘T, 
B*)>0, it follows at once that (see IV, 2, 3) ¥(B) <¢(B), hence 
for Bin and U(%, [¥, B]) < U(B, B]). Therefore one has the 


LEMMA. The Geöcze area does not exceed the essential area; that is, $G(S) \le eA(S)$.


18. Retain the notation of the preceding section. Notice that (see II, 3, 6, 7) for any closed oriented square $s$ in $\mathfrak{B}$, it is so that


= H(s) = | *T(s*- B)) | = Gis, 7, E'T,B)), i= 1, 2,3. 


Thus a necessary condition that the Geöcze area $G(S)$ be finite is that each of the representations $[{}^ix(u), \mathfrak{B}]$ for the projection surfaces ${}^iS$ be BV $\mathfrak{E}$ for $i = 1, 2, 3$. This result implies the


LEMMA. If a surface $S$ possesses a representation $[x(u), \mathfrak{B}]$ which is not BV $\mathfrak{E}$, that is, for which at least one of the representations $[{}^ix(u), \mathfrak{B}]$ is not BV $\mathfrak{E}$, then $G(S) = eA(S) = +\infty$.


19. LEMMA. If a surface $S$ possesses a representation $[x(u), \mathfrak{B}]$ which is BV E, then $G(S) = eA(S) < +\infty$.


Proof. From III, 3; IV, 4 it follows that $eA(S)$ is finite. In view of the lemma in IV, 17 it is sufficient to show that $G(S) \ge eA(S)$. Given a positive number $\epsilon$, there exists a finite system $S(\mathfrak{B})$ such that $\Phi(S(\mathfrak{B})) > eA(S) - \epsilon$. Since $[x(u), \mathfrak{B}]$ is BV E, one concludes from the lemma in II, 20 (see IV, 1) that there exists a sequence of subdivisions $S_n(B)$ for each $B$ in $\mathfrak{B}$ such that $\|S_n(B)\|$ is less than $n^{-1}$ and each $S_n(B)$ is maximal for each of the transformations $[{}^ix(u), B]$ for $i = 1, 2, 3$. From the lemma in II, 22 (see IV, 2, 17), it follows that ${}^i\phi(B) = \lim {}^i\psi(S_n(B))$ for $i = 1, 2, 3$; $B$ in $S(\mathfrak{B})$. Denote by $S_n(\mathfrak{B})$ the finite system consisting of all simple Jordan regions belonging to an $S_n(B)$ for some $B$ in $S(\mathfrak{B})$. From the triangle inequality one finds that

$G(S) \ge \sum_{B \text{ in } S(\mathfrak{B})} \Bigl[ \sum_{i=1}^{3} ({}^i\phi(B))^2 \Bigr]^{1/2} = \Phi(S(\mathfrak{B})) > eA(S) - \epsilon$.

Thus $G(S) > eA(S) - \epsilon$, and since $\epsilon$ is arbitrary, the lemma follows.

20. Summarizing the lemmas in IV, 18, 19, one concludes that the essential area and the Geöcze area of a surface are equal if the surface either has a representation which is not BV $\mathfrak{E}$, or has a representation which is BV E.




This leaves open the question of whether these areas are equal if all the representations of a surface are BV $\mathfrak{E}$, but none is BV E. Indeed, a first question
might be whether such surfaces exist. A negative answer to this question 
would enable one to close the gap between necessary conditions and sufficient 
conditions in the results in IV, 6, 10. 

21. For applications in the next chapter, the following result will be useful. Assume that a surface $S$ has a representation which is BV E; in view of the lemma in II, 18, one may assume that this representation has the form $[x(u), \mathfrak{I}]$ where $\mathfrak{I}$ is an interval. Now the essential area $eA(S)$ is finite (see IV, 4). It follows from the lemma in II, 20 that there exist interval subdivisions $S_n(\mathfrak{I})$ such that $\|S_n(\mathfrak{I})\|$ converges to zero and each $S_n(\mathfrak{I})$ is maximal for each of the transformations $[{}^ix(u), \mathfrak{I}]$ for $i = 1, 2, 3$; for brevity, $S_n(\mathfrak{I})$ is said to be maximal for $[x(u), \mathfrak{I}]$. Given a positive number $\epsilon$, let $S(\mathfrak{I})$ be a finite system of nonoverlapping Jordan regions such that $\Phi(S(\mathfrak{I})) > eA(S) - \epsilon$. For each $B$ in $S(\mathfrak{I})$, denote by $u(B)$ an arbitrary point in the interior of $B$. For each integer $n$ and each $B$ in $S(\mathfrak{I})$, denote by $S_n(B)$ the maximal collection of those intervals in $S_n(\mathfrak{I})$ whose point set sum $B_n$ is a simple Jordan region containing $u(B)$ and lying in the interior of $B$. It follows (see Kerékjártó [1]) that the $B_n$ fill up $B$ from the interior (see 4). Consequently one obtains (see II, 2, 2; IV, 2)


= (B) for i = 1, 2, 3; B in S(9), 
eA(S) = 0(S.(9)) = Do > DL 


B in S(3) B in S(3) 


> eA(S) —«. 
In view of IV, 3, this implies the 


LEMMA. Let $S$ be a surface having a representation which is BV E. Then there are representations $[x(u), \mathfrak{I}]$ for $S$ which are BV E, and for every sequence of interval subdivisions $S_n(\mathfrak{I})$ such that $\|S_n(\mathfrak{I})\|$ converges to zero and each $S_n(\mathfrak{I})$ is maximal with respect to $[x(u), \mathfrak{I}]$, it is true that $\lim \Phi(S_n(\mathfrak{I})) = eA(S)$.


CHAPTER V 


APPLICATIONS 


1. In a study of convergence in area, Rado and Reichelderfer (see R² [2]) obtained the following results. Let $S_n$, $n = 0, 1, 2, \cdots$, be a sequence of continuous surfaces having representations $[x_n(u), \mathfrak{I}]$ of non-parametric origin (see I, 2), where

$x_n(u) = (u^1, u^2, x_n^3(u^1, u^2))$ for $(u^1, u^2)$ in $\mathfrak{I}$, $n = 0, 1, 2, \cdots$.

Make the following assumptions: the functions $x_n^3(u)$ converge on $\mathfrak{I}$ uniformly to $x_0^3(u)$; each of the functions $x_n^3(u)$ for $n = 0, 1, 2, \cdots$ is BV T on $\mathfrak{I}$




(see III, 5); and the Lebesgue areas $A(S_n)$ converge to $A(S_0)$ (see I, 6). Then the total variations $V_{u^1}(x_n^3)$, $V_{u^2}(x_n^3)$ converge to $V_{u^1}(x_0^3)$, $V_{u^2}(x_0^3)$, respectively (see R² [2, chap. I, §19]). Observe that the first assumption implies that $S_n$ converges to $S_0$ (see I, 1).

2. Suppose that $S$ is a continuous surface having a representation $[x(u), \mathfrak{I}]$ for which the triple of ordinary Jacobians $J(u, [x, \mathfrak{I}])$ exists almost everywhere in $\mathfrak{I}$ (see IV, 1) and the Lebesgue area $A(S)$ is finite; then the triple $J(u, [x, \mathfrak{I}])$ is summable on $\mathfrak{I}$ (see IV, 10, 14). Assume that $S_n$, $n = 1, 2, \cdots$, is a sequence of continuous surfaces having representations $[x_n(u), \mathfrak{I}]$ satisfying the following conditions: the functions $x_n^i(u)$ converge on $\mathfrak{I}$ uniformly to $x^i(u)$ for $i = 1, 2, 3$; each of the representations $[x_n(u), \mathfrak{I}]$ for $n = 1, 2, \cdots$, is absolutely continuous $(A, J)$ (see I, 8); and the Lebesgue areas $A(S_n)$ converge to $\int_{\mathfrak{I}} \|J(u, [x, \mathfrak{I}])\|\, du$. Then Rado and Reichelderfer (see R² [2, chap. I, §§25-27]) show that

$\lim \int_{\mathfrak{I}} |J(u, [{}^ix_n, \mathfrak{I}])|\, du = \int_{\mathfrak{I}} |J(u, [{}^ix, \mathfrak{I}])|\, du$ for $i = 1, 2, 3$.


Since sequences of surfaces $S_n$ having the properties just described exist if and only if the representation $[x(u), \mathfrak{I}]$ for $S$ is absolutely continuous $(A, J)$, they show that if $[x(u), \mathfrak{I}]$ is absolutely continuous $(A, J)$, then each of the representations $[{}^ix(u), \mathfrak{I}]$ for the projection surfaces ${}^iS$ for $i = 1, 2, 3$ is also absolutely continuous $(A, J)$, and each of the transformations $[{}^ix(u), \mathfrak{I}^0]$
for 1=1, 2, 3 belongs to the class K; described in II, 8. Observe that the first 
condition on the $[x_n(u), \mathfrak{I}]$ implies that $S_n$ converges to $S_0$. These results of Rado and Reichelderfer will appear as corollaries to more general theorems which are presented in the wake of certain preliminary notions (see V, 8, 15).

3. Let $\mathfrak{I}$ be any interval in the $u$-plane. A class $\mathfrak{F}$ of intervals $I$ in $\mathfrak{I}$ is termed closed if it possesses the following properties.

1. The interval $\mathfrak{I}$ is in $\mathfrak{F}$.

2. If $I_1$ and $I_2$ are in $\mathfrak{F}$, and if $I_1 \cdot I_2$ is an interval, then $I_1 \cdot I_2$ is in $\mathfrak{F}$.

3. If $I$ is any interval in $\mathfrak{F}$, then there exists an interval subdivision $S(\mathfrak{I})$ in $\mathfrak{F}$ which contains $I$ as an element.

A particular type of closed class is important for the sequel. Let $\mathfrak{L}$ be a class containing an at most enumerable number of lines(°), each of which is parallel to one of the coordinate axes in the $u$-plane, and none of which forms a side of the interval $\mathfrak{I}$. Denote by $\mathfrak{F}(\mathfrak{L})$ the class of all intervals $I$ in $\mathfrak{I}$, each of whose four sides is formed by a segment of a line not in $\mathfrak{L}$. It is readily verified that $\mathfrak{F}(\mathfrak{L})$ is a closed class. If $\mathfrak{L}_1, \mathfrak{L}_2, \cdots$ be a finite or enumerably infinite set of classes of lines each of type $\mathfrak{L}$, then the class $\mathfrak{F}$ of all the intervals found in every one of the classes $\mathfrak{F}(\mathfrak{L}_1), \mathfrak{F}(\mathfrak{L}_2), \cdots$ is again a closed class having the same structure as $\mathfrak{F}(\mathfrak{L})$. A closed class of type $\mathfrak{F}(\mathfrak{L})$ is termed c-closed.


(°) The class $\mathfrak{L}$ may be empty.




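Let $\mathfrak{F}$ be any closed class. Assume that to every interval $I$ in $\mathfrak{F}$ there is associated a finite real number $\phi(I)$; this function is termed an interval function on $\mathfrak{F}$, and denoted by $[\phi, \mathfrak{F}]$. For every interval $I$ in $\mathfrak{F}$, define (cf. 4)

$u(I; [\phi, \mathfrak{F}]) = \text{l.u.b. } \phi(S(I))$ for finite interval systems $S(I)$ in $\mathfrak{F}$.

Evidently $\phi(I) \le u(I; [\phi, \mathfrak{F}]) \le u(\mathfrak{I}; [\phi, \mathfrak{F}])$ for $I$ in $\mathfrak{F}$. A necessary and sufficient condition that $[u, \mathfrak{F}]$ be an interval function on $\mathfrak{F}$ is that $u(\mathfrak{I}; [\phi, \mathfrak{F}])$ be finite; if $[u, \mathfrak{F}]$ is an interval function, then $[\phi, \mathfrak{F}]$ is said to possess a u-function.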
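
A one-dimensional analogue may help to fix the idea of a u-function. The sketch below is an illustration only and rests on several assumptions: the intervals are linear intervals whose end points come from a finite list, the interval function is $\phi([a, b]) = |g(b) - g(a)|$ for a sample table `g`, and the name `u_function` is hypothetical. The least upper bound over finite systems of nonoverlapping intervals is then found by a simple recursion.

```python
def u_function(phi, points):
    """Largest value of sum(phi(a, b)) over finite systems of nonoverlapping
    intervals [a, b] whose end points belong to the sorted list `points`."""
    best = [0.0] * len(points)          # best[k]: optimum inside [points[0], points[k]]
    for k in range(1, len(points)):
        best[k] = best[k - 1]           # no interval of the system ends at points[k]
        for j in range(k):
            cand = best[j] + phi(points[j], points[k])   # last interval is [points[j], points[k]]
            if cand > best[k]:
                best[k] = cand
    return best[-1]


g = {0.0: 0.0, 1.0: 1.0, 2.0: 0.0, 3.0: 2.0}


def phi(a, b):
    # Illustrative interval function: absolute increment of g over [a, b].
    return abs(g[b] - g[a])


print(u_function(phi, sorted(g)))   # 4.0
```

In this example $\phi$ of the whole interval is 2, while the u-function equals 4, the total variation of the tabulated values; the inequality $\phi(I) \le u(I; [\phi, \mathfrak{F}])$ is thus strict.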

4. Let $S$ be a continuous surface having a representation which is BV E; as noted in IV, 21, one may without loss of generality assume this representation to be of the form $[x(u), \mathfrak{I}]$, where $\mathfrak{I}$ is an interval. Let $\mathfrak{F}([x, \mathfrak{I}])$ denote the class of all intervals $I$ in $\mathfrak{I}$ having the following property: for each of the four lines $l$, a segment of which forms the boundary of $I$, it is true that (cf. IV, 1) $|{}^iT(l \cdot \mathfrak{E}({}^iT, \mathfrak{I}^0))| = 0$ for $i = 1, 2, 3$ where ${}^iT$ is the transformation $[{}^ix(u), \mathfrak{I}^0]$. From the corollary in II, 17, it is clear that $\mathfrak{F}([x, \mathfrak{I}])$ is a c-closed class. Let $I$ be any interval in $\mathfrak{F}([x, \mathfrak{I}])$ and let $S(I)$ be any interval subdivision in $\mathfrak{F}([x, \mathfrak{I}])$. Then (see II, 19; IV, 2, 21) $S(I)$ is a maximal system for $[x(u), \mathfrak{I}]$, so that

1. ${}^i\phi(S(I)) = {}^i\phi(I)$ for $i = 1, 2, 3$; $\Phi(S(I)) \ge \Phi(I)$.

Let $\mathfrak{F}$ be any c-closed subset of $\mathfrak{F}([x, \mathfrak{I}])$. Then for every interval $I$ in $\mathfrak{F}$, there necessarily exists a sequence of interval subdivisions $S_n(I)$ in $\mathfrak{F}$ such that $\|S_n(I)\|$ converges to zero. From IV, 2, 21, one finds (see 4; V, 3), for $I$ in $\mathfrak{F}$,

2. ${}^i\phi(I) = U(I; [{}^i\phi, \mathfrak{I}]) = u(I; [{}^i\phi, \mathfrak{F}])$ for $i = 1, 2, 3$; $U(I; [\Phi, \mathfrak{I}]) = u(I; [\Phi, \mathfrak{F}])$.


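5. GENERAL LEMMA. Let $[\phi_n, \mathfrak{F}]$, where $\phi_n(I) = ({}^1\phi_n(I), {}^2\phi_n(I), {}^3\phi_n(I))$ for $I$ in $\mathfrak{F}$, $n = 0, 1, 2, \cdots$, be a sequence of triples of interval functions defined on a closed class $\mathfrak{F}$. Set $\Phi_n(I) = \|\phi_n(I)\|$ for $I$ in $\mathfrak{F}$. Make the following assumptions.

1. Each $[{}^i\phi_n, \mathfrak{F}]$ is non-negative for $i = 1, 2, 3$; $n = 0, 1, 2, \cdots$.

2. Each $[{}^i\phi_n, \mathfrak{F}]$ has a u-function for $i = 1, 2, 3$; $n = 0, 1, 2, \cdots$.

3. If $I$ be any interval in $\mathfrak{F}$, if $S(I)$ be any interval subdivision in $\mathfrak{F}$, then ${}^i\phi_n(S(I)) = {}^i\phi_n(I)$ for $i = 1, 2, 3$; $n = 0, 1, 2, \cdots$.

4. $\liminf {}^i\phi_n(I) \ge {}^i\phi_0(I)$ for $I$ in $\mathfrak{F}$; $i = 1, 2, 3$.

5. $\lim u(\mathfrak{I}; [\Phi_n, \mathfrak{F}]) = u(\mathfrak{I}; [\Phi_0, \mathfrak{F}])$.

Then $\lim u(I; [{}^i\phi_n, \mathfrak{F}]) = u(I; [{}^i\phi_0, \mathfrak{F}])$ for $I$ in $\mathfrak{F}$, $i = 1, 2, 3$.


For the special case when $\mathfrak{F}$ consists of all the intervals in $\mathfrak{I}$, this lemma is stated and proved by Rado and Reichelderfer (see R² [2, chap. II, §§1-8]). A proof for this slightly more general lemma may be made by using the properties of a closed class (see V, 3), and following step by step their proof.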




6. A central result in this chapter may now be stated and proved.


THEOREM. Let $S_n$, $n = 0, 1, 2, \cdots$, be a sequence of continuous surfaces satisfying the following conditions.

1. The surfaces $S_n$ converge to $S_0$ (see I, 1).

2. Each of the surfaces $S_n$ for $n = 0, 1, 2, \cdots$ has a representation which is BV E (see IV, 1).

3. The essential areas $eA(S_n)$ converge to $eA(S_0)$.

Then the essential variations $eV({}^iS_n)$ for the projection surfaces ${}^iS_n$ converge to $eV({}^iS_0)$ for $i = 1, 2, 3$.


Proof. From conditions 1, 2; IV, 21; I, 1, it follows that there exist representations $[x_n(u), \mathfrak{I}]$ for $S_n$ each having the same interval of definition $\mathfrak{I}$, and each BV E, such that $x_n(u)$ converges uniformly on $\mathfrak{I}$ to $x_0(u)$. Let $\mathfrak{F}$ denote the class of all intervals belonging to every one of the c-closed classes $\mathfrak{F}([x_n, \mathfrak{I}])$ for $n = 0, 1, 2, \cdots$ (see V, 4); then $\mathfrak{F}$ is a c-closed class. Define triples of $\phi$-functions $[\phi_n, \mathfrak{I}]$ as in IV, 2 for $n = 0, 1, 2, \cdots$. From V, 4, 2, one obtains, for $I$ in $\mathfrak{F}$ (see III, 3; IV, 3), $n = 0, 1, 2, \cdots$

4. $U(I; [{}^i\phi_n, \mathfrak{I}]) = u(I; [{}^i\phi_n, \mathfrak{F}])$ for $i = 1, 2, 3$; $U(I; [\Phi_n, \mathfrak{I}]) = u(I; [\Phi_n, \mathfrak{F}])$;
$eV({}^iS_n) = u(\mathfrak{I}; [{}^i\phi_n, \mathfrak{F}]) = {}^i\phi_n(\mathfrak{I})$ for $i = 1, 2, 3$; $eA(S_n) = u(\mathfrak{I}; [\Phi_n, \mathfrak{F}])$.

Thus conditions V, 5, 1, 2 are satisfied. From V, 4, 1, it is clear that V, 5, 3 is fulfilled. Condition V, 5, 4 follows at once from II, 2, 2 and the lemma of Fatou, since $x_n(u)$ converges on $\mathfrak{I}$ uniformly to $x_0(u)$. Finally, from relation 4 and condition 3 follows condition V, 5, 5. The conclusion of this theorem thus follows at once from the lemma in V, 5 and relation 4.


7. COROLLARY. Let $S$ be a continuous surface having a representation which is BV E. If the Lebesgue area $A(S)$ equals the essential area $eA(S)$, then the Lebesgue areas $A({}^iS)$ of the projection surfaces ${}^iS$ equal the essential areas $eA({}^iS)$ for $i = 1, 2, 3$.


Proof. Let $P_n$, $n = 1, 2, \cdots$, be a sequence of polyhedra such that $P_n$ converges to $S$ and $A(P_n)$ converges to $A(S)$ (see I, 6, 2). Since $A(P_n) = eA(P_n)$ and each $P_n$ has a representation which is AC E (see IV, 14) for $n = 1, 2, \cdots$, it follows that the hypotheses of the theorem in V, 6 are fulfilled by $P_n$ and $S$. Thus $eV({}^iP_n)$ converges to $eV({}^iS)$ for $i = 1, 2, 3$. But since ${}^iP_n$ is a flat polyhedron, it is true that $eV({}^iP_n) = eA({}^iP_n) = A({}^iP_n)$ for $i = 1, 2, 3$; $n = 1, 2, \cdots$ (see IV, 3, 14); also $eV({}^iS) = eA({}^iS)$ for $i = 1, 2, 3$. Since ${}^iP_n$ converges to ${}^iS$, it follows that (see I, 6, 3)

$A({}^iS) \le \liminf A({}^iP_n) = \liminf eV({}^iP_n) = eA({}^iS)$ for $i = 1, 2, 3$.

The conclusion of the corollary follows from this inequality and IV, 14.


286 P. V. REICHELDERFER [March 


8. Let $S_n$, $n=0, 1, 2, \dots$, be a sequence of continuous surfaces each
possessing a representation $[x_n(u), \mathfrak{B}_n]$ of non-parametric origin (see I, 2),
where

1.   $x_n(u) = (u^1, u^2, x_n^3(u))$ for $u = (u^1, u^2)$ in $\mathfrak{B}_n$, $n=0, 1, 2, \dots$.

Then (see IV, 15) the essential area $eA(S_n)$ equals the Lebesgue area $A(S_n)$
for $n=0, 1, 2, \dots$. Suppose that $S_n$ converges to $S_0$; it follows then that
$\mathfrak{B}_n$ converges to $\mathfrak{B}_0$, and $x_n^3(u)$ converges uniformly on every closed set in the
interior of $\mathfrak{B}_0$ to $x_0^3(u)$. Now assume that each of the Lebesgue areas $A(S_n)$ is
finite for $n=0, 1, 2, \dots$; this implies (see II, 7; IV, 16) that each of the
representations $[x_n(u), \mathfrak{B}_n]$ is BV E. Thus to the theorem in V, 6 there follows the


COROLLARY. Let $S_n$, $n=0, 1, 2, \dots$, be a sequence of continuous surfaces,
each possessing a representation 1 of non-parametric origin and a finite Lebesgue
area $A(S_n)$. If the surfaces $S_n$ converge to $S_0$, if the areas $A(S_n)$ converge to $A(S_0)$,
then the variations $eV({}^iS_n)$ for the projection surfaces ${}^iS_n$ converge to $eV({}^iS_0)$ for
$i=1, 2, 3$.


In view of the lemma in III, 8 (see IV, 16), this result is clearly a generalization
of that of Rado and Reichelderfer cited in V, 1.

9. A second important result in this chapter is contained in the


THEOREM. Let $S_n$, $n=0, 1, 2, \dots$, be a sequence of continuous surfaces
satisfying the following conditions:

1. the surfaces $S_n$ converge to $S_0$ (see I, 1);

2. the surface $S_0$ has a representation $[x_0(u), \mathfrak{B}_0]$ for which the triple
$\mathfrak{J}(u, [x_0, \mathfrak{B}_0])$ of generalized Jacobians exists almost everywhere in $\mathfrak{B}_0^0$, and is
summable on $\mathfrak{B}_0^0$ (see IV, 1);

3. the surfaces $S_n$ for $n=1, 2, \dots$ have representations which are AC E;

4. the essential areas $eA(S_n)$ converge to $\int_{\mathfrak{B}_0^0} \|\mathfrak{J}(u, [x_0, \mathfrak{B}_0])\|\,du$.

Then

5. the representation $[x_0(u), \mathfrak{B}_0]$ is AC E;

6. the representation $[x_0(u), \mathfrak{B}_0]$ is absolutely continuous $(eA, \mathfrak{J})$ (see I, 8);

7. the essential variations $eV({}^iS_n)$ for the projection surfaces ${}^iS_n$ converge to
$eV({}^iS_0)$ for $i=1, 2, 3$.


In view of the theorem in IV, 9, it is clear that conclusion 5 implies con- 
clusion 6; if conclusions 5, 6 are true, then the hypotheses of the theorem in 
V, 6 are fulfilled, so conclusion 7 follows. It suffices therefore to prove 5. This 
proof is divided into two parts: an “assume without loss of generality” sec- 
tion (V, 10), and the proof itself (V, 11). 

10. No loss of generality is imposed in the preceding theorem if the fol- 
lowing additional assumptions are made: 

1. the simple Jordan regions $\mathfrak{B}_n$ fill up $\mathfrak{B}_0$ from the interior (see 4);

2. on every closed set in the interior of $\mathfrak{B}_0$, $x_n(u)$ converges uniformly to
$x_0(u)$.

In proving this, a theorem of Franklin and Wiener is useful (see Franklin 
and Wiener [1]). Given a topological map [#(u), 8] of a Jordan region % in 
the u-plane onto a Jordan region § and a positive constant ¢, there exists a 
pair of analytic functions [#,(u), R] defining a topological map of some Jordan 
region ® in the u-plane containing % in its interior onto a Jordan region R 
containing § in its interior, and such that || #.(12) — 22(x)|| <e for u in 8 and 
—a-"(u)|| for u in B, where [a—(u), B], R] are the in- 
verse maps of [a(u), 8], [@.(u), R], respectively. Let B, denote the correspond 
to under the map [a#-'(u), Then [#(#.(u)), B.] is a topological map 
of 8, onto such that || <e for u in Suppose that S is a 
surface having a representation [#(u), 8] which is AC E; consider the repre- 
sentation [#(#,(u)), B.] for S. Denote by M the maximum of the absolute 
value of the ordinary Jacobian J(u, [#., 8,]). Then a simple Jordan region B 
in § which is the image of a square s in 8, under [#,(u), B.] has an area not ex- 
ceeding M-|s|. Let *7, *T denote transformations [‘#(u), 8], [‘#(#.(u)), B.] 
respectively, for i=1, 2, 3. Then (see II, 1-9), since [#(x), $] is AC E, 


G(s, ‘T, E(*T, B%)) = | *7(s*- E(‘T, B%) | =| ET, B | 
S | K(ix, ‘T, Bd ix = D(u, ‘T)du for i = 1, 2, 3. 
f (ix )d *x (u, *T)du for i 


Thus [#(#,.(u)), B.] is also AC E. Now the representation [£(a(u)), 8] need 
not be AC E. And if [x(u), 8] is an arbitrary preassigned representation for 
S, and £ is any positive number, then [#(u), 8] may be so chosen that 
|| 2(a(u)) —x(u)]|| <f for win B (see I, 1). These results are summarized in the 


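LEMMA. Let $S$ be a continuous surface possessing a representation which is
AC E. If $[x(u), \mathfrak{B}]$ is an arbitrary representation for $S$, and if $\epsilon$ and $\zeta$ are any
positive numbers, then there exists a simple Jordan region $\mathfrak{B}_\epsilon$, a topological map
$[u_\epsilon(u), \mathfrak{B}_\epsilon]$ of $\mathfrak{B}_\epsilon$ onto $\mathfrak{B}$ such that $\|u_\epsilon(u)-u\| < \epsilon$ for $u$ in $\mathfrak{B}_\epsilon$, and an AC E
representation $[x_\epsilon(u), \mathfrak{B}_\epsilon]$ such that $\|x_\epsilon(u)-x(u_\epsilon(u))\| < \zeta$ for $u$ in $\mathfrak{B}_\epsilon$.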


Choose positive numbers such that —xo(u"’)|| <n for any 
points u’, u”’ in Bp satisfying ||u’—u’"|| <e,. Let B, be a Jordan region in the 
interior of 8» for which there exists a topological map [u,(u), B, | of B, onto 
Bosuch <e,foruinB,,m=1,2,---. —xo(u) | 
<n. Thus the simple Jordan regions B, fill up 8» from the interior (see 4), 
the surfaces So, having representations [xo(u), Bn] satisfy d(So, Son) < n—, 
and since clearly ¢A(Son)SeA(So), it follows that: eA(So,) converges to 
eA(So) (see IV, 5). Now d(Sn, Son) <d(S,, So)-+n-! and so the surfaces S, 
admit representations [#x,(u), B,] for which || (14) —xo(u)]| <d(Sn, So) 
for u in B,, m=1, 2, - - - (see I, 1). Since the surfaces S, have representations 




which are AC E for n=1, 2, - - - (see V, 9, 3), and since B, is in the interior 
of Bo, it follows by the preceding lemma that there exist simple Jordan regions 
in Bo, topological maps [+un(u), of +B, onto B, such that || 
—u||<e, for in and AC E representations [sx,(u), such that 
|| — <n for u in «B,. Since —xo(u)]| <n for 
u in *B,, it follows that || —xo(u)|| <d(S,, So) +3n— for uin +B,. The 
representations [+x,(u), «8,] thus satisfy the hypotheses of the theorem in 
V, 9 and the additional assumptions in this section. 

11. A proof for the theorem in V, 9 is now made, using the additional con- 
ditions, V, 10, 1, 2. First, observe that V, 9 imply (see IV, 5, 6) that eV(So) 
is finite, and 


lim A(S,) = = [0 
This verifies V, 9, 6 directly (see I, 8). Let J be any interval in the interior of 


Bo; in view of V, 10, 1, there exists an (J) such that J is in the interior of %, 
for n>n(I). Define (see IV, 1), for I in $8. 


y,(I) = f | ¥(u, *Tn) | du for i = 1, 2, 3; 
= (WT), WD), = for = 0, m > 
From V, 9, 3, it is seen that (see II, 9; 1, IV, 3) 


Y,(I) = f K(‘x, *T,, )d‘x fori = 1, 2,3,” > n(J). 


Since ¢A(So) is finite, it follows (see IV, 4) that K(‘x, ‘J, I*) is summable, 
and (see V, 10, 2; II, 2, 2; II, 9, 1) 


lim inf f K(‘x, *T,, ‘x = f ‘x = Yo(I) for + = 1, 2, 3. 

These relations give 

1. lim inf = ‘Yo(I) for i = 1, 2, 3, Zin Bo. 

By a known result (see R? [2, chap. II, §10]), it follows that for I in B§, 
[Yas T)) = fori = 1, 3, 


u(I; Z]) = [xn, B,])|| du for nm = 0,” > n(J). 


A direct reasoning using relation 1 shows that 


3. lim inf u(I; Z]) = u(Z; for I in Be. 




For $n=0$, $n>n(I)$, the interval $I$ lies in the interior of $\mathfrak{B}_n$; extend the sides
of $I$ until they meet the boundary of $\mathfrak{B}_n$, thus dividing $\mathfrak{B}_n$ into nine simple
Jordan regions $I={}_0\mathfrak{B}_n, \dots, {}_8\mathfrak{B}_n$. Clearly the ${}_h\mathfrak{B}_n$ fill up ${}_h\mathfrak{B}_0$ from the
interior for $h=0, \dots, 8$ (see IV, 10, 1). Denote by ${}_hS_n$ the surface having
the representation $[x_n(u), {}_h\mathfrak{B}_n]$ for $h=0, \dots, 8$, $n=0$, $n>n(I)$. Since the
representations $[x_n(u), {}_h\mathfrak{B}_n]$ for $n>n(I)$ are AC E and the essential areas
$eA({}_hS_0)$ are finite for $h=0, \dots, 8$, it follows that (see IV, 6)


eA(,S,) = f || Bx ])|| for n > n(1); 


eA(,So) = f [xo, Bo])||du  forh=0,---,8. 


From V, 10, 2 it follows that ,S, converges to ,So for h=0, - - - , 8, so that 
(see IV, 5) 


$$\liminf eA({}_hS_n) \ge eA({}_hS_0) \quad \text{for } h = 0, \dots, 8.$$


Since straight line segments form the subdivision of %, just introduced, one 
obtains 


eA(S,) = Jolie Bn ])||du = forn > n(I). 


Using the preceding relations and V, 9, 4, one finds 


lim sup ¢A(oS,) = lim sup E (S,) — eA | < Bo]})|| du. 


In view of relations 2, 3, 4, this implies that 
5. lim 1]) = u(Z; [%o, for J in Bo. 


Thus, if $ be any fixed interval in 8 and § be the class of all intervals J in 
§, it is clear from relations 1, 5 that the [y,, §] for n=O, n>n(3) satisfy 
the hypotheses of the general lemma in V, 5. In view of relation 2, therefore, 


tim | au = f | | du for I in Bo, i = 1, 2, 3. 
I I 


This relation, together with the conditions in V, 10, makes it clear that the 
hypotheses of the modified closure theorem in II, 14 are fulfilled, for the three 
sequences of transformations ‘T7,: [‘x,(u), 82], 1, 2,---, where i=1, 
2, 3. So ‘Ty is AC E(‘To, B8) for i=1, 2, 3—that is, [xo(u), Bo] is AC E (see 
IV, 1), and the theorem is established. 

12. The theorem just proved permits the following addition to the results 
in IV, 6-9. 


4. 




THEOREM. Assume that for a continuous surface $S_0$ there exists a sequence of
surfaces $S_n$ such that $S_n$ converges to $S_0$, $eA(S_n)$ converges to $eA(S_0)$ which is
finite, and each $S_n$ for $n=1, 2, \dots$ has an AC E representation. Then a necessary
and sufficient condition that a representation $[x_0(u), \mathfrak{B}_0]$ for $S_0$ be absolutely
continuous $(eA, \mathfrak{J})$ is that $[x_0(u), \mathfrak{B}_0]$ be AC E.


Proof. That the condition is sufficient is already established (see IV, 9).
So suppose that $[x_0(u), \mathfrak{B}_0]$ is absolutely continuous $(eA, \mathfrak{J})$; then (see I, 8)

$$eA(S_0) = \int_{\mathfrak{B}_0^0} \|\mathfrak{J}(u, [x_0, \mathfrak{B}_0])\|\,du.$$

The $S_n$ for $n=0, 1, 2, \dots$ thus satisfy the hypotheses of the theorem in V, 9,
whence it follows that $[x_0(u), \mathfrak{B}_0]$ is AC E.

13. If in the theorem in V, 9, the generalized Jacobians are replaced by 
the ordinary Jacobians, there results the 


THEOREM. Let $S_n$, $n=0, 1, 2, \dots$, be a sequence of continuous surfaces satisfying
the following conditions:

1. the surfaces $S_n$ converge to $S_0$ (see I, 1);

2. the surface $S_0$ has a representation $[x_0(u), \mathfrak{B}_0]$ for which the triple
$J(u, [x_0, \mathfrak{B}_0])$ of ordinary Jacobians exists almost everywhere in $\mathfrak{B}_0^0$, and is
summable on $\mathfrak{B}_0^0$ (see IV, 1);

3. the surfaces $S_n$ for $n=1, 2, \dots$ have representations which are AC E;

4. the essential areas $eA(S_n)$ converge to $\int_{\mathfrak{B}_0^0} \|J(u, [x_0, \mathfrak{B}_0])\|\,du$.

Then

5. the representation $[x_0(u), \mathfrak{B}_0]$ is AC E;

6. the representation $[x_0(u), \mathfrak{B}_0]$ is absolutely continuous $(eA, J)$;

7. the essential variations $eV({}^iS_n)$ for the projection surfaces ${}^iS_n$ converge to
$eV({}^iS_0)$ for $i=1, 2, 3$.


A proof may be made by paralleling the proof for the theorem in V, 9, us- 
ing the modified closure theorem in II, 15. 

14. The preceding theorem permits the following addition to the results 
in IV, 10-13. A proof is similar to that in V, 12. 


THEOREM. Assume that for a continuous surface $S_0$ there exists a sequence of
surfaces $S_n$ such that $S_n$ converges to $S_0$, $eA(S_n)$ converges to $eA(S_0)$ which is
finite, and each $S_n$ for $n=1, 2, \dots$ has an AC E representation. Then a necessary
and sufficient condition that a representation $[x_0(u), \mathfrak{B}_0]$ for $S_0$, for which
the triple $J(u, [x_0, \mathfrak{B}_0])$ of ordinary Jacobians exists almost everywhere in $\mathfrak{B}_0^0$,
be absolutely continuous $(eA, J)$ is that $[x_0(u), \mathfrak{B}_0]$ be AC E.


15. COROLLARY. A necessary condition that a representation $[x(u), \mathfrak{B}]$ for
a continuous surface $S$ be absolutely continuous $(A, J)$, where $A(S)$ is the
Lebesgue area of $S$ and $J(u, [x, \mathfrak{B}])$ is the triple of ordinary Jacobians, is that
$[x(u), \mathfrak{B}]$ be AC E; if $[x(u), \mathfrak{B}]$ is absolutely continuous $(A, J)$, then the
Lebesgue area $A(S)$ equals the essential area $eA(S)$. A sufficient condition that a
representation $[x(u), \mathfrak{B}]$ for $S$ be absolutely continuous $(A, J)$ is that $[x(u), \mathfrak{B}]$
be AC E and $eA(S)=A(S)$.


Proof. The second assertion in this corollary has been established in IV,
14. According to I, 6, 2, there exists a sequence of polyhedra $P_n$ which converge
to $S$ and for which $A(P_n)$ converges to $A(S)$. Now (see IV, 14) each of
the $P_n$ has an AC E representation, and $eA(P_n)=A(P_n)$ for $n=1, 2, \dots$.
The remainder of the corollary now follows at once from the theorem in V, 14.

The results of Rado and Reichelderfer cited in V, 2 are seen to be a special 
case of this corollary and of the theorem in V, 9 (see II, 15; III, 4, 2; V, 7). 


BIBLIOGRAPHY
P. FRANKLIN AND N. WIENER
1. Analytic approximations to topological transformations, Trans. Amer. Math. Soc. vol. 28
(1926) pp. 762-785.
Z. DE GEÖCZE
1. Quadrature des surfaces courbes, Mathematische und naturwissenschaftliche Berichte aus
Ungarn vol. 26 (1908) pp. 1-88.
G. H. HARDY, J. E. LITTLEWOOD, G. PÓLYA
1. Inequalities, Cambridge, 1934.
B. VON KERÉKJÁRTÓ
1. Vorlesungen über Topologie, vol. 1, Berlin, 1923.
C. KURATOWSKI
1. Topologie. I, Warsaw-Lwów, 1933.
T. RADÓ
1. Sur l'aire des surfaces courbes, Acta Univ. Szeged vol. 3 (1927) pp. 131-169.
2. Über das Flächenmass rektifizierbarer Flächen, Math. Ann. vol. 100 (1928) pp. 445-479.
3. On the problem of Plateau, Ergebnisse der Mathematik und ihrer Grenzgebiete, Berlin,
1933.
4. A lemma on the topological index, Fund. Math. vol. 27 (1936) pp. 212-225.
T. RADÓ AND P. REICHELDERFER
1. A theory of absolutely continuous transformations in the plane, Trans. Amer. Math. Soc.
vol. 49 (1941) pp. 258-307.
2. On convergence in length and convergence in variation, accepted for publication by the
Duke Math. J.
S. SAKS
1. Theory of the integral, Warsaw-Lwów, 1937.
L. TONELLI
1. Sulla quadratura delle superficie, Rendiconti delle sedute della reale accademia nazionale
dei Lincei, Classe di Scienze fisiche, matematiche e naturali, vol. 3 (1926) pp. 357-362;
pp. 445-450; pp. 633-638.


UNIVERSITY OF CHICAGO,
CHICAGO, ILL.




A THEORY FOR ORDINARY DIFFERENTIAL BOUNDARY 
PROBLEMS OF THE SECOND ORDER AND OF THE 
HIGHLY IRREGULAR TYPE 


BY 
RUDOLPH E. LANGER 


1. Introduction. The boundary problems with which this discussion is 
concerned may be given in either the form 


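$$u''(x) + \{p_{11}(x)\lambda + p_{10}(x)\}\,u'(x) + \{p_{22}(x)\lambda^2 + p_{21}(x)\lambda + p_{20}(x)\}\,u(x) = 0,$$
$$h_{i1}(\lambda)u'(a) + h_{i2}(\lambda)u(a) + h_{i3}(\lambda)u'(b) + h_{i4}(\lambda)u(b) = 0, \qquad i = 1, 2, \tag{1.1}$$

in which the coefficient functions $p_{kl}(x)$ and the solution $u(x)$ are scalars, or
in the form

$$\mathfrak{u}'(x) = \{\lambda\mathfrak{P}_1(x) + \mathfrak{P}_0(x)\}\,\mathfrak{u}(x),$$
$$\mathfrak{H}_a(\lambda)\mathfrak{u}(a) + \mathfrak{H}_b(\lambda)\mathfrak{u}(b) = \mathfrak{o}, \tag{1.2}$$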

in which the German capital letters designate matrices of the order two and
the solution $\mathfrak{u}(x)$ is a vector, that is, a matrix of two rows and one column.
In either form the parameter $\lambda$ is to be taken as complex and unbounded,
while the variable $x$ is to be taken as real and on the finite interval $(a, b)$.
On this interval the coefficients $p_{kl}(x)$, or the elements of the matrices $\mathfrak{P}_j(x)$,
are assumed to be differentiable, and such that the functions $r(x)$ which, in
the case of the system (1.1), satisfy the equation

$$r^2(x) + p_{11}(x)r(x) + p_{22}(x) = 0, \tag{1.3}$$

or, in the case of the system (1.2), make the matrix

$$\{\mathfrak{P}_1(x) - r(x)\mathfrak{I}\} \tag{1.4}$$

singular, fulfill conditions to be stated below in §2. The coefficients $h_{ij}(\lambda)$ of
the boundary relations in (1.1), or the elements of the matrices $\mathfrak{H}_a(\lambda)$ and
$\mathfrak{H}_b(\lambda)$ in (1.2), as the case may be, are to be polynomials in $\lambda$ of any degree,
and may, of course, in particular be constants.

Any boundary problem of this type is compatible either for all values of $\lambda$
or for no such value, or for a certain set of characteristic values which is
finite or denumerably infinite. This discussion is primarily concerned with the
latter case. With an infinite set of characteristic values, there exists, then,


Presented to the Society, September 2, 1941 under the title A theory for ordinary linear 
differential boundary problems of highly irregular type; received by the editors July 13, 1942. 




an associated set of characteristic solutions, and by familiar procedures an 
infinite series of these solutions may be associated with an “arbitrary” func- 
tion or vector. The series is then designated as a formal expansion of that 
function or vector, and the latter is in turn designated as the generating 
element of the expansion. 

Contingent upon the fulfillment of certain more or less general conditions 
by the generating element, the behavior of an expansion, that is, its diver- 
gence, convergence, summability, value, and so on, is essentially determined 
by the boundary problem itself, specifically by the character of the adjust- 
ment which maintains between the differential equation and the boundary 
relations. This adjustment has, therefore, been made the basis for a classifica- 
tion of boundary problems into categories which are identified by the desig- 
nations “regular,” “mildly irregular” and “highly irregular.” 

For boundary problems of the regular type a relatively complete and 
familiar theory exists. The formal expansions include the classical Fourier’s 
series as special cases, and have, broadly speaking, the salient properties of 
these series. Thus, in particular, they converge to the value of the generating 
element in an appropriately conventional sense whenever this element is 
integrable over the fundamental interval, and is of bounded variation in some 
neighborhood of the point under consideration. 

Though less has been written upon boundary problems of the mildly
irregular type(1), the state of their theory is roughly comparable with that
of the theory of regular problems. In general the formal expansions are divergent,
but are summable by means of familiar type to the values of the generating
elements.

By contrast with this, nothing that may properly be referred to as a 
general theory has heretofore been given for boundary problems of the 
highly irregular type. In these the classical methods apparently lead into 
insurmountable difficulties, and simple examples show that these difficulties 
are not due to the methods alone. The literature on problems of this type is, 
therefore, scant. Only problems which are markedly specialized and symmetrical
have been analyzed at all(2), and in them, even in the face of their
specialization, the results obtained are only in slight measure comparable 
with those of the theory of regular problems. Existing discussions, far from 


(1) Cf. M. H. Stone, Irregular differential systems of order two and the related expansion
problems, Trans. Amer. Math. Soc. vol. 29 (1927) pp. 23-53. R. E. Langer, The expansion problem
in the theory of ordinary linear differential systems, Trans. Amer. Math. Soc. vol. 31 (1929)
pp. 868-906.

(2) Cf. J. W. Hopkins, Trans. Amer. Math. Soc. vol. 20 (1919) pp. 245-259. L. E. Ward,
Trans. Amer. Math. Soc. vol. 29 (1927) pp. 716-745, ibid. vol. 32 (1930) pp. 544-557, ibid. vol.
34 (1933) pp. 417-434, Ann. of Math. (2) vol. 26 (1925) pp. 21-36, and Amer. J. Math. vol. 57
(1935) pp. 345-362. In all of these the differential equation of the problem is of a form included
in $d^nu/dx^n + \{\lambda^n + \phi(x)\}u = 0$, $n \ge 3$. Also, J. I. Vass, Duke Math. J. vol. 2 (1936) pp. 151-165,
in which the differential equation is $d^2u/dx^2 - 2\lambda\cos(p\pi/q)\cdot du/dx + \lambda^2u = 0$.




applying to such formal expansions as have arbitrary generating elements, 
have been restricted to cases in which these elements are analytic as functions 
of the complex variable, and beyond that are of certain distinctive and 
extremely special structures. 

The present paper is based upon a wholly different mode of approach to 
the problem. Its method is, in brief, the imbedding of the highly irregular 
problem in a continuous family of boundary problems of which all other 
members are regular. The given problem is thus approached through limiting 
considerations applied to existing theory. It is found on this basis that a sub- 
classification of the highly irregular problems into two virtual sub-categories 
is requisite. For problems of the first sub-category, and this includes all 
problems of the second order and highly irregular type that have been dis- 
cussed heretofore at all, a theory is derived which is in many respects closely 
aligned with the existing theories for regular and mildly irregular problems. 
Though the expansions are non-convergent, they are shown to be summable, 
in certain specifically defined senses, to the values of the generating elements, 
whenever these latter fulfill conditions such as are familiarly imposed in the 
theory of Fourier’s series. For problems of the second sub-category no results 
are derived, and it seems improbable that any expansion properties as con- 
ventionally understood inhere in problems of this type. 

The discussion has been restricted to boundary problems of the second
order. The motive for this, however, is to be sought only in the desire to keep
the paper within its present bounds. The method set forth is evidently more
generally applicable.


CHAPTER 1 


THE GIVEN BOUNDARY PROBLEM 


2. The normalization of the differential equation. The forms of the boundary
problems (1.1) and (1.2) remain unchanged under any integral linear
change of the independent variable $x$. Since a suitable change of this kind
reduces the interval $(a, b)$ to the interval $(0, 1)$, it may be assumed without
loss of generality that $a=0$ and $b=1$. In the following this will be done.
To obviate the incidence of complications which are not germane to the
matter essentially at issue, it will be assumed that on this interval the coefficients
$p_{kl}(x)$, $k=1, 2$; $l=0, 1, 2$, if the boundary problem is given in the
form (1.1), or the elements of the matrices $\mathfrak{P}_1(x)$ and $\mathfrak{P}_0(x)$, if the problem
is given in the form (1.2), are differentiable to any desired order, or at least
to such orders as may be effectively called for.
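
The reduction of the interval $(a, b)$ to $(0, 1)$ can be made explicit. The following is only a sketch: it assumes the particular change of variable $x = a + (b-a)t$, and it assumes that (1.1) and (1.2) have the forms $u'' + \{p_{11}\lambda + p_{10}\}u' + \{p_{22}\lambda^2 + p_{21}\lambda + p_{20}\}u = 0$ and $\mathfrak{u}' = \{\lambda\mathfrak{P}_1 + \mathfrak{P}_0\}\mathfrak{u}$. The symbols $v$, $\mathfrak{v}$ and the tilde coefficients are introduced here for illustration and do not occur in the text.

```latex
% Assumed change of variable: x = a + (b-a)t, so that d/dt = (b-a) d/dx.
% Vector form (1.2), with \mathfrak{v}(t) = \mathfrak{u}(a + (b-a)t):
\[
  \frac{d\mathfrak{v}}{dt}
    = \{\lambda\tilde{\mathfrak{P}}_1(t) + \tilde{\mathfrak{P}}_0(t)\}\,\mathfrak{v}(t),
  \qquad
  \tilde{\mathfrak{P}}_j(t) = (b-a)\,\mathfrak{P}_j\bigl(a + (b-a)t\bigr), \quad j = 0, 1.
\]
% Scalar form (1.1), with v(t) = u(a + (b-a)t), after multiplying by (b-a)^2;
% the coefficients p_{kl} are evaluated at x = a + (b-a)t:
\[
  v''(t) + (b-a)\{p_{11}\lambda + p_{10}\}\,v'(t)
         + (b-a)^2\{p_{22}\lambda^2 + p_{21}\lambda + p_{20}\}\,v(t) = 0 .
\]
```

In both cases the equation keeps its form and the coefficients are merely rescaled by constant factors. Since $u'(a) = v'(0)/(b-a)$ and $u'(b) = v'(1)/(b-a)$, the coefficients of the boundary relations remain polynomials in $\lambda$.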

If the boundary problem as given is in the form (1.1), let $r_1(x)$ and $r_2(x)$
designate the roots of the equation (1.3), and let it be supposed that these
roots remain distinct on the interval $(0, 1)$. The equations

$$\frac{q_1'}{q_1} + 2\frac{\phi'}{\phi} = -p_{10}, \qquad r_1\frac{q_1'}{q_1} - r_1' = p_{21} + p_{11}\frac{\phi'}{\phi},$$

then define the functions $q_1'/q_1$ and $\phi'/\phi$, and if $q_2$ is determined from the
formula

$$q_1q_2 = -\Bigl(p_{20} + p_{10}\frac{\phi'}{\phi} + \frac{\phi''}{\phi}\Bigr),$$

it is found that the equation obtained from the system

$$y_1'(x) = \lambda r_1(x)y_1(x) + q_1(x)y_2(x),$$
$$y_2'(x) = \lambda r_2(x)y_2(x) + q_2(x)y_1(x), \tag{2.1}$$

by the elimination of the function $y_2(x)$, is identical with that obtained from
the differential equation in (1.1) by making the substitution $u(x)=\phi(x)y_1(x)$.

This substitution, together with the first of the equations (2.1), reduces the
boundary relations of (1.1) to the forms

$$w_{i1}(\lambda)y_1(0) + w_{i2}(\lambda)y_2(0) + w_{i3}(\lambda)y_1(1) + w_{i4}(\lambda)y_2(1) = 0, \qquad i = 1, 2, \tag{2.2}$$
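in which the coefficients $w_{ij}(\lambda)$ are again polynomials in $\lambda$. The differential
system (1.1) is thus reducible to the form (2.1), (2.2). This latter may be
conveniently written in matrix form, thus

$$\mathfrak{y}'(x) = \{\lambda\mathfrak{R}(x) + \mathfrak{Q}(x)\}\,\mathfrak{y}(x),$$
$$\mathfrak{W}^{(0)}(\lambda)\mathfrak{y}(0) + \mathfrak{W}^{(1)}(\lambda)\mathfrak{y}(1) = \mathfrak{o}, \tag{2.3}$$

in which

$$\mathfrak{R}(x) = \begin{pmatrix} r_1(x) & 0 \\ 0 & r_2(x) \end{pmatrix}, \qquad \mathfrak{Q}(x) = \begin{pmatrix} 0 & q_1(x) \\ q_2(x) & 0 \end{pmatrix},$$
$$\mathfrak{W}^{(0)}(\lambda) = \begin{pmatrix} w_{11}(\lambda) & w_{12}(\lambda) \\ w_{21}(\lambda) & w_{22}(\lambda) \end{pmatrix}, \qquad \mathfrak{W}^{(1)}(\lambda) = \begin{pmatrix} w_{13}(\lambda) & w_{14}(\lambda) \\ w_{23}(\lambda) & w_{24}(\lambda) \end{pmatrix}.$$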

If the boundary problem as given is in the form (1.2), let $r_1(x)$ and $r_2(x)$
be the roots of the determinant equation

$$\begin{vmatrix} p_{11}^{(1)}(x) - r & p_{12}^{(1)}(x) \\ p_{21}^{(1)}(x) & p_{22}^{(1)}(x) - r \end{vmatrix} = 0, \tag{2.4}$$

in which $(p_{ij}^{(1)}(x)) = \mathfrak{P}_1(x)$, and let it be supposed that these roots are distinct
on the interval $(0, 1)$. The nonsingular matrix $\mathfrak{T}(x)$ which fulfills the relation

$$\mathfrak{P}_1(x)\mathfrak{T}(x) = \mathfrak{T}(x)\mathfrak{R}(x)$$

then exists, and the substitution $\mathfrak{u}(x) = \mathfrak{T}(x)\mathfrak{w}(x)$ gives to the differential
equation in (1.2) the form

$$\mathfrak{w}'(x) = \{\lambda\mathfrak{R}(x) + \bar{\mathfrak{P}}_0(x)\}\,\mathfrak{w}(x),$$

with

$$\bar{\mathfrak{P}}_0(x) = \mathfrak{T}^{-1}(x)\{\mathfrak{P}_0(x)\mathfrak{T}(x) - \mathfrak{T}'(x)\}.$$

(3) Throughout the paper German capital letters will be used to designate square matrices
of order two. Lower case German letters will correspondingly be used to designate vectors of
two components, and Latin letters to denote scalars.

The further substitution 
= 


reduces the equation to the differential equation in (2.3), and in fact the entire 
boundary problem (1.2) to the form of (2.3). Since the problem, whether given 
in the form (1.1) or in the form (1.2) is thus reducible to (2.3), the further 
considerations may be confined to this latter form. 

It has already been assumed that the functions $r_1(x)$ and $r_2(x)$ are distinct
on the interval $(0, 1)$. Further restrictions upon these functions, which are
imposed in all existing theories of boundary problems (2.3) when the independent
variable is real(5), and which are now also to be imposed herewith
upon the present discussion are the following:


HYPOTHESIS 1. On the interval $(0, 1)$ the functions $r_1(x)$, $r_2(x)$, and
$\{r_1(x) - r_2(x)\}$ are bounded from zero, and each of them is real except possibly
for a constant complex factor.


There are essentially two types of configuration which conform to this
hypothesis, namely:
Configuration 1,

$$r_j(x) = \sigma_j\rho(x), \qquad j = 1, 2, \tag{2.5}$$

in which $\sigma_1$ and $\sigma_2$ are constants different from zero with a ratio which is not
real, and $\rho(x)$ is a real positive function which is bounded from zero; and
Configuration 2,

$$r_j(x) = \sigma\rho_j(x), \qquad j = 1, 2, \tag{2.6}$$

in which $\sigma$ is a non-vanishing constant, and $\rho_1(x)$, $\rho_2(x)$ are real functions
which are bounded from each other and from zero.

To this point no stipulation has been made as to the assignment of subscripts
to the functions $r_1(x)$ and $r_2(x)$. It is convenient to assign the subscripts
now and henceforth in such a way that in the event of the configuration 1,
the value of $\{\arg r_1(x) - \arg r_2(x)\}$ lies between $0$ and $\pi$. If the relations

$$\Gamma_j(x) = \int_0^x r_j(x)\,dx, \qquad \Gamma_j = \Gamma_j(1), \qquad j = 1, 2, \tag{2.7}$$

are used to define their left-hand members, it follows then in the event of
configuration 1, that

$$0 < \arg\Gamma_1 - \arg\Gamma_2 < \pi. \tag{2.8}$$

In the event of the formulas (2.6) the value of $\{\arg r_1(x) - \arg r_2(x)\}$ is a
multiple of $\pi$. By a suitable assignment of subscripts in this case of configuration 2,
one or the other of the sets of relations

$$\text{(a)}\quad \arg\Gamma_1 - \arg\Gamma_2 = 0, \qquad |\Gamma_1| > |\Gamma_2|,$$
$$\text{(b)}\quad \arg\Gamma_1 - \arg\Gamma_2 = \pi, \qquad |\Gamma_1| \ge |\Gamma_2|, \tag{2.9}$$

may therefore be achieved. It will be assumed in the following that the subscripts
have been so assigned.

(4) Throughout the paper $\delta_{ij}=0$ if $i \ne j$ and $\delta_{ij}=1$ if $i=j$.

(5) For the theory when $x$ ranges over a region of the complex plane cf. R. E. Langer, The
boundary problem of an ordinary linear differential system in the complex domain, Trans. Amer.
Math. Soc. vol. 46 (1939) pp. 151-190.

3. The boundary conditions. The components of the vector boundary relation
of the system (2.3) are given explicitly in the equations (2.2), and in this,
as has already been noted, each coefficient $w_{ij}(\lambda)$ is a polynomial in $\lambda$, which
may in particular vanish. Individually these relations are, of course, not
uniquely specific, since they may be replaced by any independent linear
combinations of the two without any modification of the content of the
conditions as a whole being thereby induced. In the vector form of the condition,
as it appears in (2.3), such a replacement is accomplished by the multiplication
of the relation on the left by some nonsingular matrix, and conversely
any such multiplication by a nonsingular matrix is of merely formal
effect.
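
The remark on left multiplication can be written out as follows. This is an illustrative sketch: the names $\mathfrak{W}^{(0)}(\lambda)$, $\mathfrak{W}^{(1)}(\lambda)$ for the boundary matrices of (2.3) and the multiplier $\mathfrak{S}$ are assumed notation, and the particular $\mathfrak{S}$ shown is a made-up example.

```latex
% For any constant matrix \mathfrak{S} with \det\mathfrak{S} \neq 0:
\[
  \mathfrak{S}\,\{\mathfrak{W}^{(0)}(\lambda)\mathfrak{y}(0)
                 + \mathfrak{W}^{(1)}(\lambda)\mathfrak{y}(1)\} = \mathfrak{o}
  \quad\Longleftrightarrow\quad
  \mathfrak{W}^{(0)}(\lambda)\mathfrak{y}(0)
    + \mathfrak{W}^{(1)}(\lambda)\mathfrak{y}(1) = \mathfrak{o}.
\]
% Hypothetical example: this \mathfrak{S} replaces the first relation
% by the sum of the two relations and leaves the second unchanged.
\[
  \mathfrak{S} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},
  \qquad \det\mathfrak{S} = 1 .
\]
```

The equivalence holds because $\mathfrak{S}^{-1}$ exists, so either relation is recovered from the other by a further left multiplication.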

If in either of the relations (2.2) all four coefficients $w_{ij}(\lambda)$, $j=1, 2, 3, 4$,
have some factor $(\lambda-\lambda_0)$ in common, the boundary problem is compatible
at $\lambda_0$. The same, but no more, follows if they have $(\lambda-\lambda_0)$ as a multiple
common factor. A reduction of the multiplicity of such a factor is, therefore,
a permissible formal simplification, and in proceeding it will be assumed
that such simplifications have been made, so that any common factor of the
coefficients $w_{ij}(\lambda)$, with $i=1$ or $i=2$, is a simple factor of at least one of them.

If the two relations (2.2) are linearly dependent identically in $\lambda$, the
boundary problem is permanently compatible, and, from the point of view
of this discussion, is without interest. That case is, therefore, to be excluded
by the assumption that of the matrices

$$\begin{pmatrix} w_{1h}(\lambda) & w_{1l}(\lambda) \\ w_{2h}(\lambda) & w_{2l}(\lambda) \end{pmatrix}, \qquad h, l = 1, 2, 3, 4;\ h \ne l, \tag{3.1}$$

at least one is not identically singular.
Let the maximum degree of the polynomials $w_{ij}(\lambda)$ be designated by $\tau$.



The matrices (3.1) are, then, all expressible in the polynomial form 


= (*.2,0) + ACM + eee ATE 
1 


with each symbol € standing for a constant matrix, and with 
(4.1.7) D, 


for some indices (h, /). If the matrices €"-*) are not all singular, let 72=7. 
Otherwise, let a set of constant elements s;; be determined such that the 
matrix (s;;) is nonsingular, whereas 


(i) (su, 0, 

for some (h, /), 

(ii) (Ser, Sao) = 0, 

for all (h, 1). Then let 72 be defined as the least integer for which 


for all (h, /)(®). Because of the assumption made above relative to the matrices 
(3.1), 220. If the boundary relation of the problem (2.3) is, then, multiplied 
by the matrix (s;;), and if thereafter the matrices (s;;)B (A), 4=0, 1, are 
again denoted simply by ¥ (A), it follows that each element of these matrices 
is a polynomial in A; and that when 7;=7, then 7; is the maximum degree of 
the elements in an ith row. 

It may be noted now that either one of the integers $\tau_1$ and $\tau_2$ may be increased
by unity by the multiplication of the boundary relation on the left
by the respective matrix

$$\begin{pmatrix} \lambda-\lambda_1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 0 \\ 0 & \lambda-\lambda_2 \end{pmatrix},$$

in which $\lambda_1$ and $\lambda_2$ are any values of the parameter for which the boundary
problem is initially incompatible. It is thus a matter of an adjustment of
the boundary problem to assure the relations

$$\tau_i \ge 1, \qquad i = 1, 2. \tag{3.2}$$

It must be noted, however, that this adjustment (which plays no role except
in §28) is not wholly formal, for since the matrix factors used in achieving
it are singular for a value of $\lambda$, that value is introduced as a characteristic
value by the adjustment.
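
Why the adjustment introduces a characteristic value can be seen from the characteristic determinant. The sketch below assumes the standard characterization that the problem (2.3) is compatible exactly at the zeros of $D(\lambda)$ as written, with $\mathfrak{Y}(x, \lambda)$ any nonsingular matrix solution of the differential equation. The symbols $D$, $\tilde{D}$ and $\mathfrak{M}_1$ are introduced here for illustration, and the diagonal form of $\mathfrak{M}_1$ is an assumption about the multiplying matrix.

```latex
% Assumed characteristic determinant of the problem (2.3):
\[
  D(\lambda) = \det\{\mathfrak{W}^{(0)}(\lambda)\mathfrak{Y}(0,\lambda)
                     + \mathfrak{W}^{(1)}(\lambda)\mathfrak{Y}(1,\lambda)\}.
\]
% Left multiplication of the boundary relation by the assumed matrix
% \mathfrak{M}_1(\lambda) = diag(\lambda - \lambda_1, 1):
\[
  \tilde{D}(\lambda)
    = \det\mathfrak{M}_1(\lambda)\cdot D(\lambda)
    = (\lambda - \lambda_1)\,D(\lambda),
  \qquad
  \mathfrak{M}_1(\lambda)
    = \begin{pmatrix} \lambda - \lambda_1 & 0 \\ 0 & 1 \end{pmatrix}.
\]
```

Since the problem is initially incompatible at $\lambda_1$, one has $D(\lambda_1) \ne 0$, so $\tilde{D}$ acquires a simple zero at $\lambda_1$ and no other new zeros. Every element of the first row of the boundary matrices is multiplied by $(\lambda - \lambda_1)$, which raises the maximum degree in that row by one.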

Finally, in virtue of the structure of the matrices @(A) and BA) as 


(*) For the purposes of multiplication vectors are always to be regarded as matrices, of one 
row and two columns if they are left-hand factors, and of two rows and one column if they are 
in the role of right-hand factors. 




now arranged, it will be clear that the matrices W (A), 4=0, 1, as defined 
by the formulas 


BWA) = A*)BVA), 
WA) = 


have elements which are polynomials in (1/A), at least one element in a first 
row, and at least one element of a second row having a constant term which 
is not zero. 

4. The solutions of the differential equation. Under the Hypothesis 1 of
§2, the forms and structural properties of the solutions of the matrix differential
equation

$$\mathfrak{Y}'(x, \lambda) = \{\lambda\mathfrak{R}(x) + \mathfrak{Q}(x)\}\,\mathfrak{Y}(x, \lambda), \tag{4.1}$$

and hence of the vector differential equation of the boundary problem (2.3)
may be regarded as known(7), especially insofar as large values of the parameter
are concerned. Certain of these properties are relevant to the discussion
at hand and may be cited as follows.

(i) With the matrix $\mathfrak{E}(x, \lambda)$ defined by the formula

$$\mathfrak{E}(x, \lambda) = (\delta_{ij}e^{\lambda\Gamma_j(x)}), \tag{4.2}$$

and with $\mathfrak{P}^{(0)}(x)$ defined to be identically the unit matrix, explicit formal
procedures may be applied to determine successively the matrices of a sequence
$\mathfrak{P}^{(h)}(x)$, $h=1, 2, 3, \dots$, so that the expression

$$\Bigl\{\sum_{h=0}^{\infty} \frac{\mathfrak{P}^{(h)}(x)}{\lambda^h}\Bigr\}\,\mathfrak{E}(x, \lambda) \tag{4.3}$$

formally satisfies the equation (4.1), namely so that upon substitution of the
expression (4.3) in the place of $\mathfrak{Y}(x, \lambda)$ in the equation (4.1), the coefficients
of like powers of $\lambda$ in the two resulting members of the relation are in every
case equal.

(ii) The infinite series (4.3) is in general divergent. However, to each $\lambda$
half-plane of the set defined by the relations

$$(m - 1/2)\pi - \arg\{\Gamma_1 - \Gamma_2\} \le \arg\lambda \le (m + 1/2)\pi - \arg\{\Gamma_1 - \Gamma_2\}, \tag{4.4}$$


(3.3) 


for integral values of m, there corresponds an actual analytic solution of the
equation (4.1) which is asymptotically represented by the expression (4.3)
for the values of λ in that half-plane.

(iii) In terms of any analytic nonsingular solution Y(x, λ) of the equation
(4.1) the general solution of that equation, and the general solution of the


(7) Cf. G. D. Birkhoff, and R. E. Langer, The boundary problems and developments associated 
with a system of ordinary linear differential equations of the first order. Proceedings of the Ameri- 
can Academy of Arts and Sciences vol. 58 (1923) pp. 51-128. 


300 R. E. LANGER . [March 


vector differential equation of the boundary problem (2.3), are given, respec- 
tively, by the formulas 


in which €™ and c™ are an arbitrary matrix and an arbitrary vector that are 


independent of x. 
If in the first of the expressions (4.5) the arbitrary matrix is written as
Y^{-1}(0, λ)C, the general solution of the equation (4.1) is expressed in the form

(4.6) $Y(x, \lambda)\, Y^{-1}(0, \lambda)\, C.$

In this form the solution involved is wholly determined by the matrix C,
since the form (4.6) is invariant under the substitution of any one nonsingular
solution Y(x, λ) for any other one. The general solution of the vector equation
(2.3) may be similarly given by the formula

(4.7) $y(x, \lambda) = Y(x, \lambda)\, Y^{-1}(0, \lambda)\, c.$


Any specific one of the solutions Y(x, λ) to which the statement (ii) above
refers, defines through the formula

(4.8) $Y(x, \lambda) = P(x, \lambda)\, E(x, \lambda),$

a matrix P(x, λ) which is analytic in λ, and which, by (ii), is such that for λ
in the respective half-plane of the set (4.4), the relation


(4.9) P(x, > (ae) 


maintains. From this it is evident that the matrix in question is nonsingular
when |λ| is sufficiently large, and that, therefore, it may be used in the role
of Y(x, λ) in the formulas (4.6) and (4.7). The asymptotic representation of
the solution (4.6) or (4.7) determined by any specific matrix C or vector c is
thus obtainable from the relations (4.8) and (4.9). This representation is valid
for all large values of λ, despite the fact that the matrix P(x, λ) to which the
relation (4.9) applies is different in different half-planes (4.4), precisely by
virtue of the fact that the formulas (4.6), (4.7) are invariant under
replacements of the solution Y(x, λ).


CHAPTER 2 


THE FAMILY OF BOUNDARY PROBLEMS 
5. The formal construction and characteristic equation of the family. Let
κ_il, i = 1, 2; l = 1, 2, 3, 4, be a set of constants, which for the instant may
remain unspecified, and let ν be taken as a parameter whose range is to include
the value zero. The formulas
BOA, v) = VOO) + 


(5.1) BOA, v) = BOA) + 


\ 




then define the matrices which appear as their left-hand members, the elements
of these matrices being polynomials in λ and linear polynomials in ν.
The differential system


v) = {AR(x) + Q(x) }y(z, »), 
v)y(0, + VBA, v)y(1, ») = o, 


then defines a family of boundary problems which yields the given problem
(2.3) for the parameter value ν = 0.

If, on the pattern of the formulas (3.3), the matrices W^(h)(λ, ν), h = 0, 1,
are now defined by the formulas

(5.3) BWMA, = (FA) BMA, 7), 0, 1, 
it follows from (3.3) and (5.1) that 

WA, vy) = WA) + (Kis), 

WHA, v) = WA) + r(Ki, 442). 


The elements of these matrices are therefore polynomials in (1/λ), of which
the constant terms are linear in ν and all other terms are independent of ν.
Moreover, at least one element in a first row and at least one element in a
second row has a constant term that does not vanish when ν = 0.

The general solution of the differential equation of the problem (5.2) is
given by the expression (4.7), a non-trivial solution being associated with a
non-vanishing vector c. Upon substitution of this expression into the boundary
relation, the latter assumes the form

(5.5) $\mathfrak{D}(\lambda, \nu)\, Y^{-1}(0, \lambda)\, c = 0,$
in which 
(5.6) DA, v) = v)Y(O, A) + BMA, v)Y(i, A). 


The condition that there exist a non-vanishing vector c to satisfy the equation
(5.5), and hence that there exist a non-trivial solution (4.7) of the boundary
problem is, therefore, evidently that the matrix (5.6) be singular, namely that

(5.7) $D(\lambda, \nu) = 0,$

where D(λ, ν) denotes the determinant of the matrix (5.6). The compatibility
of the boundary problem corresponding to any specific value of ν, is thus
contingent upon λ being a root of the characteristic equation (5.7). These roots
are called the characteristic values.

It will be noted that the matrix 𝔇(λ, ν), and hence also its determinant
D(λ, ν), depends upon the choice of the nonsingular solution Y(x, λ) of the
equation (4.1) which appears in (5.6). However, as has already been observed,
any product Y(x, λ)Y^{-1}(0, λ) is independent of the solution Y(x, λ) from


(5.2) 


(5.4) 




which it is formed. From this it is seen at once, that the product

(5.8) $\mathfrak{D}(\lambda, \nu)\, Y^{-1}(0, \lambda)$

is invariant, and that the left-hand members of the equations (5.7) formed
from different solutions Y(x, λ) differ only in their non-vanishing constant
factors. The characteristic values, as roots of the equation (5.7), are thus
independent of the choice of Y(x, λ).
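The invariance just asserted may be illustrated numerically. The following is a minimal sketch under stated assumptions, not the construction of the text: it takes a hypothetical constant-coefficient model with R = diag(r_1, r_2) and Q = 0, so that Y(x, λ) = diag(e^{λ r_1 x}, e^{λ r_2 x}) is one solution of (4.1) and Y(x, λ)C is another for any constant nonsingular C. The boundary matrices W0, W1, the matrix C, and all function names are invented for the illustration.

```python
import cmath

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matadd(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def inverse(a):
    d = det(a)
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]

R = (1.0, 2.0j)                      # hypothetical diagonal of R(x); Q is taken as zero
C = [[1.0, 2.0], [0.5, 3.0]]         # hypothetical constant nonsingular matrix
W0 = [[1.0, 0.0], [2.0, 1.0]]        # hypothetical boundary matrix at x = 0
W1 = [[0.0, 1.0], [1.0, -1.0]]       # hypothetical boundary matrix at x = 1

def fundamental(x, lam):
    """One nonsingular solution of Y' = lam * R * Y."""
    return [[cmath.exp(lam * R[0] * x), 0], [0, cmath.exp(lam * R[1] * x)]]

def other_solution(x, lam):
    """A second nonsingular solution, Y(x, lam) * C."""
    return matmul(fundamental(x, lam), C)

def boundary_matrix(solution, lam):
    """The matrix of (5.6) formed from the given solution."""
    return matadd(matmul(W0, solution(0.0, lam)), matmul(W1, solution(1.0, lam)))

def invariant_product(solution, lam):
    """The product of (5.8): boundary matrix times the inverse of Y(0, lam)."""
    return matmul(boundary_matrix(solution, lam), inverse(solution(0.0, lam)))

lam = 0.7 + 0.3j
p1 = invariant_product(fundamental, lam)
p2 = invariant_product(other_solution, lam)
max_difference = max(abs(p1[i][j] - p2[i][j]) for i in range(2) for j in range(2))
det_ratio_error = abs(det(boundary_matrix(other_solution, lam))
                      - det(C) * det(boundary_matrix(fundamental, lam)))
```

In this model the two products agree to rounding error, and the two determinants differ only by the constant factor det C, so that their zeros in λ coincide.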

Since the determinant D(λ, ν), when formed from a solution Y(x, λ) that
is analytic in λ, is itself analytic, the number of characteristic values in any
bounded portion of the complex λ-plane, and hence in particular within any
circle however large, is evidently finite. For the consideration of those roots
which lie outside of a suitably large circle, it is convenient to construct the
equation (5.7), for λ in any half-plane of the set (4.4), from that solution
to which the formulas (4.8), (4.9) apply. If the elements a_il(λ, ν), i = 1, 2;
l = 1, 2, 3, 4, are defined, then, by the formulas


= WA, »)PO, d), 


5.9 
' 442) = BWA, »)B(1, d), 


it is found that 

(5.10) DA, v) = (Gig + 54207). 

The determinant D(A, v) is accordingly given by the formula 
(5.11) DQ, v) = { Ay — + — Aer}, 


in which, if a_{i5} is interpreted as being identical with a_{i1},

(5.12) $A_l(\lambda, \nu) = \begin{vmatrix} a_{1,l} & a_{1,l+1} \\ a_{2,l} & a_{2,l+1} \end{vmatrix}, \qquad l = 1, 2, 3, 4.$

Since the matrices Y(x, λ) which enter into these formulas are different
for λ in different half-planes (4.4), the elements a_il(λ, ν), and the determinants
A_l(λ, ν) are also different functions for such different values of λ. However,
each solution Y(x, λ) in question is asymptotically described as it is used by
the relation (4.9), and in this relation the matrices on the right are specific
and independent of λ. It follows that each element a_il(λ, ν), and likewise each
determinant (5.12), may be taken as asymptotically equivalent to a respective
formal power series in 1/λ, and thus as subject to a single representation
for all large values of λ.

It is useful, for the exploitation of certain symmetries, to extend the
definitions of the elements a_il(λ, ν) and of the determinants A_l(λ, ν) to all
indices l. This may be done by the conventions

(5.13) $a_{i,l_1} = a_{i,l_2}, \qquad A_{l_1} = A_{l_2}, \qquad \text{for } l_1 \equiv l_2 \pmod 4.$




If the constants V_l are then defined, thus, for all integers h
T 


it is easily verifiable that the formula (5.11) may be written in the form 
m+3 


(5.15) D(A, v) = (— 
l=m 


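with any choice whatever of the integer m. The characteristic values other
than zero, and hence all those which are numerically large, are thus roots of
the equation

(5.16) $A_1(\lambda, \nu)\, e^{\lambda V_1} - A_2(\lambda, \nu)\, e^{\lambda V_2} + A_3(\lambda, \nu)\, e^{\lambda V_3} - A_4(\lambda, \nu)\, e^{\lambda V_4} = 0.$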


6. On the regularity or irregularity of a boundary problem. For any given
value of ν, any specific coefficient A_l(λ, ν) of the equation (5.16) is a function
of λ. This function either vanishes identically or is asymptotically representable
by a series in powers of 1/λ, with a constant term that may appropriately
be designated by the symbol A_l(∞, ν). The coefficient A_l(λ, ν) in question
will be said to be regular or irregular at the given value of ν, according as its
constant term A_l(∞, ν) is different from zero or vanishes.

If a given boundary problem is one that fulfills the hypothesis 1 in the
manner of the configuration 1 of §2, the points λV_l for any four successive
indices l, mark the vertices of a parallelogram centered at the origin in the
complex λ-plane. The abscissas of these vertices evidently determine the
magnitudes of the respective exponentials in the left-hand member of the
characteristic equation (5.16), and that exponential which is associated with
the vertex furthest to the right is the dominant one. Inasmuch as the orientation
of the parallelogram is a function of arg λ, and any specific vertex is
furthest to the right for some values of arg λ, the exponentials in the equation
(5.16) may in this case all be characterized as, in an obvious sense, potentially
dominant.

If the boundary problem fulfills the hypothesis 1 in the manner of the
configuration 2, on the other hand, the points λV_l are collinear, and lie upon
the segment terminated at λV_1 and λV_3 if the formulas (2.9a) apply, and on
the segment terminated at λV_2 and λV_4 if the formulas (2.9b) are applicable.
Since a position furthest to the right is impossible for all but the end points of
the segment, only two of the exponentials which appear in the equation (5.16)
are in this case potentially dominant.
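The distinction between the two configurations may be checked numerically. The sketch below is illustrative only. It assumes the vertices to be V_1 = −(Γ_1 + Γ_2)/2, V_2 = (Γ_1 − Γ_2)/2, V_3 = −V_1, V_4 = −V_2, which is a parallelogram centered at the origin whose sides satisfy Γ_l = V_{l+1} − V_l; the formula (5.14) itself is not reproduced here. The function names and the sample values of Γ_1 and Γ_2 are hypothetical.

```python
import cmath
import math

def vertices(gamma1, gamma2):
    """Assumed vertices V_1..V_4 of the parallelogram centered at the origin."""
    v1 = -(gamma1 + gamma2) / 2
    v2 = (gamma1 - gamma2) / 2
    return [v1, v2, -v1, -v2]

def potentially_dominant(gamma1, gamma2, steps=3600):
    """Indices l (1-based) for which lambda*V_l lies furthest right for some arg lambda."""
    vs = vertices(gamma1, gamma2)
    found = set()
    for k in range(steps):
        # Sample arg(lambda) on a grid that avoids the exact tie directions.
        lam = cmath.exp(2j * math.pi * (k + 0.5) / steps)
        abscissas = [(lam * v).real for v in vs]
        found.add(abscissas.index(max(abscissas)) + 1)
    return found

# Configuration 1: the ratio Gamma_1 / Gamma_2 is not real.
config1 = potentially_dominant(1.0, 0.5 + 1.0j)
# Configuration 2: the ratio is real and positive, so the points are collinear.
config2 = potentially_dominant(2.0, 1.0)
```

With these sample values every index occurs in the first case, whereas in the second only the end points V_1 and V_3 of the segment are ever furthest to the right.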

The type of the boundary problem is essentially determined by those
coefficients A_l(λ, ν) that are associated with potentially dominant
exponentials in the characteristic equation. If the coefficients of the potentially
dominant exponentials are all regular, the boundary problem itself is said
to be of the regular type. If at least one coefficient of a potentially dominant




exponential is irregular, but no one vanishes identically, the boundary problem
is said to be of the mildly irregular type. Finally, if in the equation (5.16)
at least one coefficient of a potentially dominant exponential is identically
zero, but at least two coefficients of the equation are not identically zero,
the boundary problem is said to be of the highly irregular type.

It will be observed at once that this classification fails to account for such
boundary problems as have characteristic equations with less than two
non-vanishing terms. Such problems, however, have no expansion theories
associated with them. For, if in the equation (5.16) just one term is non-vanishing,
the number of characteristic values is clearly finite. On the other hand, if
every coefficient vanishes, the equation (5.16) is evidently vacuous. The
boundary problem is then compatible for all values of λ, and no characteristic
values are distinguished.
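The classification may be summarized as a small decision procedure. The following is a sketch only: the record type, its field names, and the label "unclassified" for problems with fewer than two non-vanishing terms are hypothetical conveniences and not notation of the text.

```python
from dataclasses import dataclass

@dataclass
class Coefficient:
    identically_zero: bool       # the coefficient A_l vanishes identically in lambda
    constant_term: complex       # A_l(infinity, nu); ignored when identically zero
    potentially_dominant: bool   # it multiplies a potentially dominant exponential

def classify(coefficients):
    """Type of a boundary problem from the coefficients of equation (5.16)."""
    non_vanishing = [c for c in coefficients if not c.identically_zero]
    if len(non_vanishing) < 2:
        return "unclassified"            # no expansion theory is associated
    dominant = [c for c in coefficients if c.potentially_dominant]
    if any(c.identically_zero for c in dominant):
        return "highly irregular"
    if any(c.constant_term == 0 for c in dominant):
        return "mildly irregular"
    return "regular"

regular_case = [Coefficient(False, 1.0, True)] * 4
mild_case = [Coefficient(False, 0.0, True)] + [Coefficient(False, 2.0, True)] * 3
high_case = [Coefficient(True, 0.0, True)] + [Coefficient(False, 2.0, True)] * 3
degenerate_case = [Coefficient(True, 0.0, True)] * 3 + [Coefficient(False, 1.0, True)]
labels = [classify(case) for case in
          (regular_case, mild_case, high_case, degenerate_case)]
```

The order of the tests matters: the count of non-vanishing terms is examined first, since a problem with fewer than two such terms is excluded from the classification whatever its dominant coefficients may be.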

The classification thus described applies in particular to the boundary
problem (2.3) which was originally given, and which is identified in the
family by the parameter value ν = 0. The discussion at hand is concerned
wholly with the case in which that problem is highly irregular. Of those
coefficients A_l(λ, 0) which multiply potentially dominant exponentials in the
characteristic equation, at least one is therefore to be taken as identically
zero. It will be supposed, primarily for the purpose of delimiting these
deductions to their present bounds, that those coefficients which do not vanish
identically are regular. Although this is in fact a restrictive hypothesis,
inasmuch as the case in which some non-vanishing coefficients are irregular
is a more general one, the features which are engendered by such irregularities
are, from the standpoint here to be maintained, only secondarily germane.
They constitute in the first instance the salient source of the distinctions
between the regular and the mildly irregular cases.


HYPOTHESIS 2. The given boundary problem is one for which at least one
coefficient of a potentially dominant exponential in the characteristic equation 
vanishes identically, and for which the non-vanishing coefficients are regular and 
at least two in number. 


7. Specifications upon the family of boundary problems. Inasmuch as the
constants κ_il introduced in §5 have remained unspecified, the boundary problem
of the family associated with any value of ν different from zero has been
only formally defined, and its type, in particular, has remained indeterminate.
This is now to be made specific. From the formulas (4.9), (5.4) and (5.9),
the evaluations

$a_{il}(\infty, \nu) = a_{il}(\infty, 0) + \nu\, \kappa_{il}, \qquad i = 1, 2;\ l = 1, 2, 3, 4,$


are obtained. Through the relation (5.12), therefore, the expressions A_l(∞, ν)
are formally quadratic polynomials in ν in which the coefficients of the linear
and quadratic terms are functions of the constants κ_il. It is now to be
stipulated that these constants be chosen so that the coefficient of each quadratic
term vanishes. More precisely: The constants κ_il, i = 1, 2; l = 1, 2, 3, 4, shall
be such that each expression A_l(∞, ν) is a linear polynomial in ν, and not
identically zero.

That this specification is not impossible of fulfillment in any case, may be
established as follows, by the explicit display of a set of constants which
have the requisite properties. Under the hypothesis 2, there exists an index p
which is such that

(7.1) $A_p(\lambda, 0) \equiv 0, \qquad A_{p+1}(\infty, 0) \neq 0,$

and A_{p+2}(∞, 0), A_{p+3}(∞, 0) are not both zero. With such an index p let the
constants κ_il be taken thus:

$\kappa_{i,p} = -a_{i,p+2}(\infty, 0), \quad \kappa_{i,p+1} = 0, \quad \kappa_{i,p+2} = -a_{i,p}(\infty, 0), \quad \kappa_{i,p+3} = 0, \qquad i = 1, 2.$

It is readily computed that with these constants

$A_p(\infty, \nu) = \nu A_{p+1}(\infty, 0),$
$A_{p+1}(\infty, \nu) = A_{p+1}(\infty, 0),$
$A_{p+2}(\infty, \nu) = A_{p+2}(\infty, 0) + \nu A_{p+3}(\infty, 0),$
$A_{p+3}(\infty, \nu) = A_{p+3}(\infty, 0) + \nu A_{p+2}(\infty, 0),$

and since each of these expressions is a linear polynomial in ν with at least
one nonzero coefficient, they evidently all have the structure prescribed.
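The computation may be verified in exact arithmetic on a small example. The sketch below rests on assumptions that should be stated plainly: it takes the determinants in the form A_l = a_{1,l} a_{2,l+1} − a_{1,l+1} a_{2,l} with indices read modulo 4, and it takes the constants as κ_{i,p} = −a_{i,p+2}(∞, 0), κ_{i,p+2} = −a_{i,p}(∞, 0), the others being zero, which is one reading of the display above. The sample columns and all names are hypothetical.

```python
from fractions import Fraction

def det(c, d):
    """Determinant whose columns are c and d."""
    return c[0] * d[1] - d[0] * c[1]

def coefficients(columns, kappa, nu):
    """The four constant terms A_l(infinity, nu), with indices taken modulo 4."""
    shifted = [(c[0] + nu * k[0], c[1] + nu * k[1]) for c, k in zip(columns, kappa)]
    return [det(shifted[l], shifted[(l + 1) % 4]) for l in range(4)]

def kappa_for(columns, p):
    """Assumed choice: kappa_p = -a_{p+2}, kappa_{p+2} = -a_p, the others zero."""
    kappa = [(0, 0)] * 4
    kappa[p] = tuple(-x for x in columns[(p + 2) % 4])
    kappa[(p + 2) % 4] = tuple(-x for x in columns[p])
    return kappa

# Hypothetical columns (a_{1l}, a_{2l}) at nu = 0; the first two are parallel,
# so that the first determinant vanishes. The index p is 0-based here.
columns = [(1, 2), (2, 4), (0, 1), (1, 1)]
p = 0
base = coefficients(columns, [(0, 0)] * 4, 0)
nu = Fraction(1, 3)
perturbed = coefficients(columns, kappa_for(columns, p), nu)
expected = [nu * base[1], base[1], base[2] + nu * base[3], base[3] + nu * base[2]]
```

Since the entries are rational, the agreement of `perturbed` with `expected` is exact and not merely approximate; no term in ν² survives in any of the four determinants.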

With the coefficients A_l(λ, ν) thus constructed, it is evidently possible to
determine in the complex ν-plane a closed neighborhood of the origin within
which they are all regular except possibly at ν = 0. Such a neighborhood will
be referred to as a proper region for ν, and henceforth it shall be understood
that all values of ν that are brought into question lie in such a region. With
the parameter so delimited the family of boundary problems is now such that
each of its members associated with a value of ν different from zero is of the
regular type, and only the originally given problem is irregular. In an evident
sense, therefore, the given highly irregular boundary problem has been
imbedded in a continuous aggregate of regular problems, and appears as
analytically approachable through this aggregate by the medium of a passage
of the parameter to the limiting value zero. The continuing discussion is
almost exclusively concerned with considerations centering upon such an
approach. Inasmuch as it is adequate to the ends sought to restrict the
considerations to modes of approach in which arg ν is bounded, that restriction
is to be understood henceforward.

8. Two sub-categories of highly irregular boundary problems. Rouché's
theorem. The method by which a theory for boundary problems of the highly
irregular type is thus to be deduced, depends essentially upon the establishment
of a one to one correspondence between the characteristic values and
solutions of the given problem with those of the regular problems of the
imbedding family, and the consequent expression of the former as limits of
the latter as ν → 0. The existence of these limits as finite values is, therefore,
obviously a primary requisite, and since they may or may not all exist,
depending upon the individual problem at hand, a partition of the entire
category of highly irregular boundary problems into sub-categories is called
for. These will be distinguished by the designations A and B. Problems in
which the limits in question do all exist will be allocated to the sub-category
A, and to them the theory under deduction will be applicable. All highly
irregular boundary problems of the second order for which any analyses at
all are at present extant belong to this sub-category. On the other hand,
problems in which some of the limits fail to exist will be allocated to the
sub-category B. To them the theory will have no application, and it seems
improbable that problems of this type admit of any expansion theory of a
customary sort.

A familiar theorem(8), upon which many of the considerations which
follow are to be based, may be stated thus:

If within and on any specific closed contour of the complex λ-plane, two
functions φ(λ) and ψ(λ) are each analytic, and if on this contour the relation

(8.1) $|\psi(\lambda)| < |\phi(\lambda)|,$

maintains, the equation

(8.2) $\phi(\lambda) + \psi(\lambda) = 0,$

has precisely as many roots within the contour as has the equation

(8.3) $\phi(\lambda) = 0.$
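The theorem may be illustrated by a direct count of roots. The sketch below is offered as an illustration and is no part of the deduction: it counts the zeros of a function within a circle by the winding number of its values along the circle, and applies the count to a hypothetical pair φ(λ) = 1 − e^λ, ψ(λ) = 0.1/λ on the circle of unit radius about 2πi. The functions, the contour, and the sample sizes are assumptions chosen for the example.

```python
import cmath
import math

def count_zeros(f, center, radius, samples=4096):
    """Zeros of f inside a circle, found as the winding number of f along it.

    Assumes f is analytic within and on the circle and has no zero on it.
    """
    total = 0.0
    previous = f(center + radius)
    for k in range(1, samples + 1):
        z = center + radius * cmath.exp(2j * math.pi * k / samples)
        current = f(z)
        total += cmath.phase(current / previous)   # small increment of arg f
        previous = current
    return round(total / (2 * math.pi))

def phi(lam):
    return 1 - cmath.exp(lam)

def psi(lam):
    return 0.1 / lam            # analytic within the circle, since 0 lies outside it

center, radius = 2j * math.pi, 1.0
on_circle = [center + radius * cmath.exp(2j * math.pi * k / 360) for k in range(360)]
hypothesis_holds = all(abs(psi(z)) < abs(phi(z)) for z in on_circle)
zeros_phi = count_zeros(phi, center, radius)
zeros_sum = count_zeros(lambda z: phi(z) + psi(z), center, radius)
```

Here the relation (8.1) holds at every sampled point of the contour, and the two counts agree, each circle of this kind enclosing a single root. The sections which follow apply the theorem in just this manner to circles about the roots of an exponential equation.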

For future reference it will be noted here, that due to the manner in
which the parameter ν enters into the structure of the functions A_l(λ, ν), the
following may be stated.

If A_l(λ, 0) ≡ 0, then

(8.4) $A_l(\lambda, \nu) = \nu\{\beta_l + \eta_l(\lambda, \nu)\}, \qquad \text{with } \beta_l \neq 0.$

If A_l(λ, 0) ≢ 0, then

(8.5) $A_l(\lambda, \nu) = \{\alpha_l + \beta_l \nu + \eta_l(\lambda, \nu)\}, \qquad \text{with } \alpha_l + \beta_l \nu \neq 0.$

In either case η_l(λ, ν) designates a function that is asymptotically
representable by a series in powers of 1/λ with a vanishing constant term, and
otherwise with coefficients that are polynomials in ν. It is evident, therefore, that
the relation

(8.6) $\lim_{\lambda \to \infty} \eta_l(\lambda, \nu) = 0,$

maintains uniformly as to ν. The coefficients α_l and β_l are in every case
constants.

(8) Rouché's theorem. Cf. E. C. Titchmarsh, The theory of functions, Oxford, 1932, p. 116.
CHAPTER 3 


BOUNDARY PROBLEMS OF THE SUB-CATEGORY B 


9. Problems of the configuration 2. In the case of any boundary problem
which fulfills the hypothesis 1 in the manner of the configuration 2 of §2,
the ratio of the constants Γ_1 and Γ_2 is real. It will be shown that all such
problems are to be allocated to the sub-category B. If the case in hand is one
to which the formulas (2.9a) are applicable, the constant y which fulfills the 
relation I; =I, is positive and greater than 1. The potentially dominant 
exponentials in the characteristic equation (5.16) are those in which the sub- 
scripts are odd, and at v=0 the coefficient of at least one of these is zero. 
Let p be chosen so that this coefficient is A ,(A, v). It is found then, that after 
division by the leading exponential the equation (5.16) may be written in 
the form 


(9.1) Ap (A, ») — + — A = 0,. 


with s=1, and with T'=I, or l= —I: according as p=1, or p=3. If, alter- 
natively, the problem given is one to which the formulas (2.9b) apply the 
value of y which fulfills the relation T; = —7I: is positive and at least equal 
to 1. In this case the potentially dominant exponentials are those with even 
subscripts, and if A,(A, v) is taken as the coefficient of such a one and as 
vanishing at y=0, it is found that the equation (5.16) is again expressible in 
the form (9.1), in this instance with s= —1, and with =I; or l= —T; ac- 
cording as p=2, or p=4. The problems of the configuration 2 may, there- 
fore, all be analyzed by a consideration of the equation (9.1). 

The case in which γ = 1 may be readily disposed of. The equation (9.1)
is then quadratic in e^{λΓ}, and as A_p(λ, ν) tends to the limit zero with ν some
roots e^{λΓ} and hence some characteristic values λ, become infinite. In the
further considerations, in which it may now be assumed that γ > 1, it is
convenient to analyze separately the cases in which A_{p+3s}(λ, ν) does not vanish
with ν, and that in which it does.

If the formulas (8.5) apply when l = p+3s, then since (8.4) applies when
l = p the characteristic equation (9.1) is expressible in the form (8.2) with


Ap+3e 
¢= 6, -—“ ar, 
v 


y= np(A, v) (1/v)OF { v) +A ~ p+2e(A, vert}, 


The roots of the equation (8.3) are located at the points %, given for integral 
values of m by the formula 


(9.2) (1/1) + log 


If 6 is any positive constant such that 8|T,| <7, these roots are enclosed 
individually by the circles of the nonoverlapping set 


(9.3) AA] = 4, 


and it is seen at once that on any such circle 
(1/r)eT = 


and that as 


©, and 


Since by the first of these relations 
= — eT), 


on any circle (9.3), there clearly exists a positive constant M which is
independent of ν, and such that for λ on the circles the relation |φ(λ)| > M
maintains. But it is also clear from the evaluations given, that |ψ(λ, ν)| < M
whenever |ν| is sufficiently small. For all such values of ν, therefore, the condition
(8.1) is fulfilled, and it follows that each circle contains a root of the equation
(8.2), namely contains a characteristic value. It is evident from (9.2), however,
that each of the circles (9.3) recedes to infinity as ν → 0. The enclosed
characteristic values therefore approach no finite limits.

If with /=p+-3s the formulas (8.4) apply, then since they also apply with 
l=», it follows under the hypothesis 2 that the function A »,,(A, v) is given 
by the formula (8.5). In this case the characteristic equation (9.1) may be 


written in the form (8.2) with 
= By — 
= ») — (1/r)erT + — OT} 
OT {Boiss + nprse(A, v)}. 
The roots of the equation (8.3) are now located at the points 
= (1/40) + log ( 
Apis 
and with this interpretation of \%, these roots are again enclosed in the circles 
(9.3). On these circles it is seen that 
(1/r)ePT = (Bp/apy.)er™, 
= B,(1 — eT), 


and that as v0 


and 


By precisely the reasoning of the previous case it is seen that each circle
encloses a characteristic value and carries it to infinity as ν → 0.

The assertion that all boundary problems which conform to the
configuration 2 are of the sub-category B, has thus been substantiated.

10. Problems for which two consecutive coefficients of the characteristic
equation vanish. Since by virtue of the results of the preceding section, the
continuing discussion is concerned only with boundary problems which conform
to the configuration 1, the points V_l, l = 1, 2, 3, 4, in the complex plane
mark the vertices of an actual parallelogram, and each exponential in the
characteristic equation is potentially dominant. Let the interior angle of this
parallelogram at the vertex V_l be designated by ω_l. The cases upon which
the attention is to be focused in this section, are those in which for some
index l the two consecutive coefficients, A_l(λ, ν) and A_{l+1}(λ, ν) vanish with ν.
Since the respective angles ω_l and ω_{l+1} are adjacent angles of the parallelogram,
one of them at least does not exceed a right angle, and if this one is
designated by ω_p, the index p is thereby fixed to be such that


$A_p(\lambda, 0) \equiv 0, \qquad \omega_p \le \pi/2,$
and also such that with either s = 1 or s = −1, as the case may be,
$A_{p+s}(\lambda, 0) \equiv 0.$


With this determination of p the formulas (8.4) apply when l = p, p+s, and
the formulas (8.5) do so when l = p+2s, p−s. After division by ν and by the
leading exponential, the characteristic equation (5.16) is accordingly
expressible in the form (8.2) with
= By — 
(10.1) mp(A, ») — {Bore + ») fed 


In this instance the roots of the equation (8.3) lie at the points 


1 By 
10.2 i+1 ( 
( ) mnt + log 


With this interpretation of \%, these roots are enclosed in the circles of the 
set (9.3). On these circles 


(10.3) = B» ar 


and hence 
(10.4) = = eV | 




The quantity on the left of the relation (10.3) is thus seen to be bounded
uniformly as to ν and m, and the existence of a constant M which is
independent of ν and which is such that |φ| > M for all λ on the circles, is evident.

Consider now those circles of the set above which are associated with the 
values of m for which sm is positive. On these circles 


V 
— Vo] = { 2mxi + log | v|} 


(10.5) 1 ( B 
+ [Voie — Vp} 4 4A + lo 
2 Vere — Vp v| 
With the evaluation 


V V 
(10.6) arg = SwWy, 
V> 


the real part of the first term on the right of the formula (10.5) is found to be 


V pts | 1 
- sin w, + cos w,:log . 
— V> »| 
This becomes negatively infinite as ν → 0, and since the remaining term on
the right of the formula (10.5) is bounded, it follows that


OW 0, 


Since for λ on the circles in question λ → ∞, the inequality |ψ| < M maintains
for all values of ν that are sufficiently small. For such ν, then, each of
these circles contains a characteristic value, and these values become infinite
as ν → 0. Any boundary problem for which two consecutive coefficients of the
characteristic equation vanish must, therefore, be allocated to the
sub-category B.

11. A third type of problem of the sub-category B. If for the given boundary
problem the index p is determined so that

(11.1) $A_p(\lambda, 0) \equiv 0,$

it may be assumed in this continuing discussion that

(11.2) $A_{p-1}(\lambda, 0) \not\equiv 0, \qquad A_{p+1}(\lambda, 0) \not\equiv 0,$

since the alternative has been disposed of in §10. The characteristic equation
(5.16), after division by ν and by the leading exponential, may, therefore, be
written in the form (8.2), in which the function φ is as given by the formula
(10.1). The corresponding function ψ is then expressible in the form


= np(d, ») — {By + ») 


(11.3)” 




and in these formulas the index s may be taken to be either 1 or −1. As in
the preceding section, the roots of the equation (8.3) are given by the
formulas (10.2), and are thus enclosed in the respective circles (9.3). Let the
attention be directed upon those circles of this set that are associated with
indices m for which sm exceeds a certain positive value to be further
determined below, and let λ be considered upon these circles. The evaluations
(10.3) and (10.4) then maintain, and for some positive constant M which is
independent of ν, the function φ fulfills the relation |φ| > M.
From the formulas (9.3) and (10.3) the equality


, 1 | + log | v|} 


V V s AX ] 
+ [ pt + 


may be verified. In the right-hand member of this the second term is
independent of m and is bounded as to ν, whereas the first term has a real part
which may be computed, with the use of (10.6), to be


2 V i 
(11.4) — Vol Vore— Vo| sin wp 


— = | 


+ [| V,| | Voie — V,| cos w,| log | v|}. 
If the constants V_l and ω_p involved in this are such that

(11.5) $|V_{p+s} - V_p| - |V_{p-s} - V_p| \cos \omega_p > 0, \qquad s = 1, -1,$

it is clear that the value (11.4) becomes infinite as ν → 0. The cases contrary
to this are those which are here to be specifically considered.

If the boundary problem under discussion is one for which the relation 
(11.5) is not fulfilled, either when s=1 or when s=—1, then for such s the 
quantity within the brace in (11.4) is arbitrarily large when sm is sufficiently 
large, and the absolute value of the exponential 


eh VptsVp-s) 


is accordingly arbitrarily small uniformly in ν. It may be seen, therefore,
from (11.3) and (10.3) that for all values of ν such that |ν| is suitably small
and for the values of m such that sm exceeds a value appropriately large, the
relation

$|\psi| < M$

is fulfilled. With the condition (8.1) thus met, each of the circles in question
contains a characteristic value, and retains it in its interior as ν → 0. Inasmuch
as the circles recede to infinity as ν → 0, it is clear that any boundary problem




which does not satisfy both of the relations (11.5) must be allocated to the 
sub-category B. 

The failure of either one of the relations (11.5) admits of a simple
geometrical interpretation. Relative to the parallelogram with vertices at the
points V_l, l = 1, 2, 3, 4, in the complex plane, the symbols ω_p, |V_{p+s} − V_p|,
|V_{p−s} − V_p|, respectively designate the angle at the vertex V_p and the
lengths of the adjacent sides. If ω_p is a right angle or an obtuse angle, no
failure of the condition (11.5) is possible. However, if ω_p is acute a failure is
possible and is articulate of the fact that one of the sides of the parallelogram
adjacent to the vertex V_p is exceeded in length by the projection of the
other one upon it. It will be seen at once that in such a case the diagonal
V_{p−1}V_{p+1} divides the parallelogram into two triangles each of which has at
one of the vertices V_{p−1}, V_{p+1} an angle that is not acute. The boundary
problems associated with such a configuration are, therefore, those which are
allocated in this section to the sub-category B.
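The geometrical interpretation lends itself to a direct check. The sketch below is illustrative and assumes the condition (11.5) in the form |V_{p+s} − V_p| − |V_{p−s} − V_p| cos ω_p > 0 for s = 1 and s = −1, namely that each side adjacent to V_p exceeds the projection of the other upon it. The function names and the sample vertices are hypothetical.

```python
import cmath
import math

def condition_11_5(v_prev, v_p, v_next):
    """True when each side at V_p exceeds the projection of the other upon it."""
    u = v_next - v_p
    w = v_prev - v_p
    dot = (u * w.conjugate()).real        # equals |u| |w| cos(omega_p)
    return abs(u) ** 2 > dot and abs(w) ** 2 > dot

def angle_at(vertex, a, b):
    """Interior angle at `vertex` of the triangle (vertex, a, b), in radians."""
    return abs(cmath.phase((a - vertex) / (b - vertex)))

def diagonal_angles_acute(v_prev, v_p, v_next):
    """True when the triangle cut off by the diagonal is acute at both ends of it."""
    return (angle_at(v_prev, v_p, v_next) < math.pi / 2
            and angle_at(v_next, v_p, v_prev) < math.pi / 2)

v_p, v_next = 0j, 1 + 0j
# A long second side at a small angle: its projection exceeds the first side.
failing = 3 * cmath.exp(1j * math.pi / 6)
# Sides of comparable length at a larger acute angle: no failure.
passing = 1.2 * cmath.exp(1j * math.pi / 3)

results = [(condition_11_5(v, v_p, v_next), diagonal_angles_acute(v, v_p, v_next))
           for v in (failing, passing)]
```

In both samples the two tests agree: the condition fails exactly when the triangle on the diagonal has an angle that is not acute at one end of the diagonal.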

As has been remarked above, the theory under deduction will have no
application to boundary problems assignable to the sub-category B. A
hypothesis to disbar such problems from further consideration is, therefore,
called for. To facilitate its enunciation, among other things, it is convenient
to adopt here the relations


(11.6) = (- 1)*+T;, Ten = (— h= 0, = 1, = 2, 


which extend the definitions of Γ_l to all indices l. It will be noted that under
them

$\Gamma_{l_1} = \Gamma_{l_2}, \qquad \text{if } l_1 \equiv l_2 \pmod 4,$

and from the relations (5.14) that

(11.7) $\Gamma_l = V_{l+1} - V_l.$

The vector Γ_l thus represents the lth side of the parallelogram with vertices
at the points V_l, and for all l

(11.8) arg — arg = 

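The characteristic equation may accordingly be written, with any choice of l,
in the form

(11.9) $A_l(\lambda, \nu) - A_{l+1}(\lambda, \nu)\, e^{\lambda \Gamma_l} + A_{l+2}(\lambda, \nu)\, e^{\lambda(\Gamma_l + \Gamma_{l+1})} - A_{l+3}(\lambda, \nu)\, e^{\lambda \Gamma_{l+1}} = 0.$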


Since the values |V_{p+s} − V_p|, |V_{p−s} − V_p| are now symbolizable by
|Γ_{p−1}|, |Γ_p|, and hence have in some order the values |Γ_1|, |Γ_2|, the
conditions (11.5) may evidently be expressed in the form (iii) below.


HYPOTHESIS 3. The given boundary problem is one whose characteristic
equation, when written in the form (11.9) with ν = 0, fulfills the specifications:
(i) the ratio Γ_1/Γ_2 is not real; and




(ii) if Ap(A, 0) =0 then 0) #0, and 


CHAPTER 4 
THE CHARACTERISTIC VALUES 


12. The characteristic values for restricted values of ν. In the complex
λ-plane the relations


— — arg (T: — S argd < — 2/2 — arg + Tr), 


(12.1) ial ey, 


define, for each index l and for any non-negative real constant N, a region
which is to be denoted by S_l(N). Any four consecutive regions of this set
cover the part of the plane which lies outside of the circle with radius N
centered at the origin. The asymptotic distribution of the characteristic
values may therefore be determined by a study of their distribution in the
region (12.1) with the index l unspecified.

The characteristic equation of the family of boundary problems has been
written in the form (11.9). If the abbreviations

(12.2) $c_l(\nu) = A_l(\infty, \nu),$

are adopted, the further relations

(12.3) $A_l(\lambda, \nu) = c_l(\nu)\{1 + \chi_l(\lambda, \nu)\},$

define the functions χ_l(λ, ν) here involved, and these are evidently all
arbitrarily small, uniformly as to ν in any proper ν-region, when λ lies in a region
(12.1) in which N is sufficiently large.

Under the relation (2.8), which maintains in all boundary problems now 
under consideration, the constants of the set 


Ty + 
Tas 


1 = 1, 2, 3, 4, 


sin arg { 


are all positive. Let a designate the smallest one of these constants, and let 
a; be any positive constant less than a. It is then easily verified that the rela- 
tion 

(12.4) | arn| < for in Si(N), 


maintains, and that when » is restricted to a part of its proper region in 
which 


(12.5) Sey, 


then the values 




1 
(12.6) arin |, 
C14 i(¥) 


are arbitrarily small in S_l(N), if N is sufficiently large.
With the symbol r standing in the place of either 0 or 1, consider the
functions φ(λ, ν) and ψ(λ, ν) defined by the formulas
o(A, vy) = 1— AT: + 
Ci41(v) 


ci(v) 


v) = (1 — 2r) {6 


in which 
A r, 
Oo(A, v) = xi(A, v) — AusQy ») ATin, 
ci(v) 


A v) 


A1(A, v) = x141(A, v) 


If λ* is taken to designate any zero of the function φ(λ, ν) in the region
S_l(N), the equation φ(λ*, ν) = 0 may be written in the form

1+ v) 


(12.7) 
ci(v) 1 + 76,(A*, v) 


On the circle 
(12.8) = + Ad, =.«, 


in which ε is positive and less than the smaller of the numbers π/|Γ_j|, j = 1, 2,
but otherwise arbitrary, the relation (12.7) yields the evaluation


ar; = 1+ v) 
ci(v) 1 + 76,(A*, v) 


and since θ_0 and θ_1 are arbitrarily small over the region S_l(N), it follows that
on the circle (12.8) the function φ(λ, ν) differs by arbitrarily little from the
value (1 —e™*), while the function (A, v) is arbitrarily small. Since a relation 
(8.1) thus maintains upon the circle (12.8), the equations (8.2) and (8.3) have
the same numbers of roots within it.
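The root-counting step can be illustrated numerically. The following is only a sketch under stated assumptions: the relation (8.1) is taken to be a Rouché-type hypothesis (the perturbing function smaller in absolute value than the principal one on the circle), and the functions `phi` and `psi` are hypothetical stand-ins chosen for illustration, not the functions φ(λ, ν) and ψ(λ, ν) of the text. The zeros inside the circle are counted by the argument principle.

```python
import cmath
import math


def zeros_inside(f, center, radius, samples=4000):
    """Count zeros of an analytic f inside a circle via the argument principle."""
    total = 0.0
    prev = f(center + radius)
    for k in range(1, samples + 1):
        z = center + radius * cmath.exp(2j * math.pi * k / samples)
        cur = f(z)
        total += cmath.phase(cur / prev)  # small phase increment along the circle
        prev = cur
    return round(total / (2 * math.pi))


def phi(lam):
    # Hypothetical principal function with a simple zero at 2*pi*i.
    return 1 - cmath.exp(lam)


def psi(lam):
    # Hypothetical small perturbation; its pole at 0 lies outside the circle.
    return 0.1 / lam


def perturbed(lam):
    return phi(lam) + psi(lam)


center = 2j * math.pi  # the zero of phi around which the circle is drawn
radius = 1.0

count_phi = zeros_inside(phi, center, radius)
count_perturbed = zeros_inside(perturbed, center, radius)
```

On this circle the stand-in `psi` is smaller in modulus than `phi`, so the two counts agree, which is the pattern of reasoning the text applies to the equations (8.2) and (8.3).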
Now when r=1 the equation (8.3) is the characteristic equation and (8.2)
is the equation

141(¥) ar 
ci(v) 


whereas when r=0 the roles of these two equations are reversed. It follows 


(12.9) 0, 




that the roots 
ci(v) 


of the equation (12.9) which lie in the region S_l(N), may be set into one to
one correspondence with the characteristic values in that region, with corresponding
elements within a distance ε of each other. Since the points λ*_{l,m} are
spaced at distances exceeding 2ε from each other, it must be concluded that
the characteristic values in the region S_l(N) are all simple, and that they are
enumerable and denotable in the manner λ_{l,m} so that


(12.10) Nim = (1/0) — log 


(12.11)  $|\lambda_{l,m} - \lambda^{*}_{l,m}| < \epsilon$, for $\lambda_{l,m}$ in $S_l(N)$.


Inasmuch as the constant ε may be taken to be arbitrarily small, and the
relation (12.11) is nevertheless fulfilled when N is sufficiently large, the use of
the symbolism of asymptotic representation, namely


(12.12)  $\lambda_{l,m} \sim \lambda^{*}_{l,m}$,

is evidently justified. The entire set of characteristic values is clearly enumerable,
since those which lie within any circle of radius N centered at the origin
are finite in number, while those outside such a circle stand in correspondence
with the enumerable sets (12.10) with l = 1, 2, 3, 4.
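The spacing claim used above (comparison points further apart than 2ε whenever ε < π/|Γ|) can be checked on a model equation. This is a minimal sketch, assuming a two-term exponential equation of the hypothetical shape c·e^{λΓ} = 1; the constants `GAMMA` and `C` are illustrative values and are not taken from the boundary problem.

```python
import cmath
import math

GAMMA = 1.5 - 0.5j  # hypothetical exponent constant
C = 0.3 + 0.4j      # hypothetical coefficient ratio


def root(m):
    """m-th root of C * exp(lam * GAMMA) = 1."""
    return (2j * math.pi * m - cmath.log(C)) / GAMMA


roots = [root(m) for m in range(-3, 4)]

# Each point satisfies the model equation up to rounding error.
residuals = [abs(C * cmath.exp(lam * GAMMA) - 1) for lam in roots]

# Consecutive roots are equally spaced at distance 2*pi/|GAMMA|.
spacings = [abs(roots[i + 1] - roots[i]) for i in range(len(roots) - 1)]
expected_spacing = 2 * math.pi / abs(GAMMA)
```

Circles of radius ε < π/|Γ| about such points are therefore mutually disjoint, which is what allows each circle to isolate a single characteristic value.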

The relation (12.5) restricts the parameter ν from a neighborhood of the
origin ν=0. This prohibited neighborhood can, however, be made arbitrarily
small by the choice of a value of N that is sufficiently large. Irrespective of
how small the proper region to which ν is initially confined may be, therefore,
the considerations above are applicable for a range of values of ν that is not
empty when the characteristic values concerned are remote enough from the
origin of the λ-plane. Now for ν in a suitably small region, a comparison of
the formulas (12.2) with (8.4) and (8.5) shows that when α_l≠0 then the
difference {log c_l(ν) − log α_l} is arbitrarily small, whereas when α_l=0 then
log c_l(ν) = log (β_l ν). It may be drawn from the relations (12.11), therefore,
that


(12.13) < Be, for Arm in Si(N), 


where 


(12.14) (1/21) — (— 1)' log j= 0,1, 


Kv 


when k is any index for which A_k(λ, 0) = 0, and


aq 


(12.15) — (— 1)’ log j= 0,1, | 


when q is an index for which A_{q−1}(λ, 0) ≠ 0, A_q(λ, 0) ≠ 0, and A_{q+1}(λ, 0) ≠ 0.
It will be observed that the points (12.15) are constant as to ν, and hence that
any characteristic values represented by them through the relation (12.13)
are asymptotically constant. Under the hypotheses made, however, at most
one index q can exist, and there may be no such index at all for the boundary
problem under consideration. In at least two and possibly in all four of the
regions S_l(N), l = 1, 2, 3, 4, the characteristic values accordingly refer through
the relations (12.13) to points of the respective sets (12.14), and so depend
in an essential manner upon ν.

It will be observed for later reference that insofar as an index k is concerned
to which the formulas (12.14) apply, the reasoning epitomized in the
relations (12.13) would be in no way affected if the function A_{k+2}(λ, ν) were
replaced by 0, and the functions A_l(λ, ν) for l = k−1, k, k+1, were replaced
by their leading terms as those are given in the formulas (8.4) and (8.5). These
replacements substitute the equation


(12.16) Bev — — = 0, 


in the place of the characteristic equation. In the regions S_{k−1}(N) and S_k(N),
therefore, the roots of this equation are also represented asymptotically by
the points of the sets (12.14).

13. On critical values of λ and ν. By virtue of the hypothesis 3 the
boundary problem at hand is one for which the relations (11.2) maintain if
the index p is suitably determined. Let such a determination of p be fixed
upon, and throughout this section let it be understood that k is used to stand
at will for either p or p+2. For these values of k, the equations (12.16) are
to be considered in the respective λ half-planes S_{k−1,k}, of which each consists
of the pair of adjacent sectors S_{k−1}(0) and S_k(0).

If, for any value of ν, the equation (12.16) admits of a multiple root in the
half-plane S_{k−1,k}, that root is a zero of the derived function


It is, therefore, a point of the set 


1 
(13.2) (2m + 1)ri + log 
— Te 


with the integer m, such that it lies in the region in question. Upon substitu- 
tion of the values (13.2) into the equation (12.16), the respectively corre- 
sponding values of v are found to be given by the formulas 


(13.3)  $\nu^{(k,m)} = H_k e^{-m\Omega}$, $k = p, p+2$,


in which each coefficient H_k is a (complex) constant independent of m,
whereas




(13.4) 2= 


The equation (12.16) obviously defines ν as a single-valued analytic function
of λ. For the deductions at hand, however, the inverse relationship,
namely the dependence of λ upon ν, is of more immediate consequence. In the
map defining this dependence the points (13.3) are branch points. According
as the domain of ν includes these points or excludes them, the equation (12.16)
may be regarded as defining λ(ν) as an infinitely many-valued function, or as
defining its infinity of roots as distinct single-valued functions of ν. Of these
alternatives the latter one is to be adopted, and the points (13.2) and (13.3)
are to be referred to henceforth as critical values of λ and ν, respectively. It is
to be shown, among other things, that there exist in the domain of ν paths of
approach to the origin which avoid the critical values, and in fact that there
exist such paths along which |ν| varies monotonically and arg ν varies within
an arbitrarily prescribed positive range, and along which the roots of the
equation (12.16) are uniformly bounded from the critical values of λ.

The formula (13.4), together with the hypothesis 3, assures that both the
real and the pure imaginary components of the constant Ω are positive. For
k=p and for k=p+2, therefore, the points of the respective set (13.3) lie
upon a logarithmic spiral which winds in upon the point ν=0, the points
given by successive indices m being spaced along this spiral at regular angular
intervals of magnitude equal to the imaginary part of Ω. Let κ be defined as
the smallest positive constant of the set


r 


@ 


The relation 
plkem) — ylar) 


(13.5) 2 


p(@r) 


maintains then for every pair of distinct critical values ν^{(k,m)} and ν^{(q,r)}, irrespective
of whether they lie upon the same or different spirals.
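The spiral structure of the critical values (13.3) is easy to exhibit numerically. The sketch below assumes the form ν = H·e^{−mΩ} with hypothetical constants `H` and `OMEGA` (the latter with positive real and imaginary parts, as the text requires); none of the numerical values come from the paper.

```python
import cmath
import math

OMEGA = 0.2 + 0.7j  # hypothetical constant with Re > 0 and Im > 0
H = 1.0 + 1.0j      # hypothetical coefficient


def critical_nu(m):
    """m-th critical parameter value on one spiral."""
    return H * cmath.exp(-m * OMEGA)


points = [critical_nu(m) for m in range(12)]

# Successive moduli shrink by the fixed factor exp(-Re OMEGA) ...
ratios = [abs(points[i + 1]) / abs(points[i]) for i in range(len(points) - 1)]

# ... while the argument turns by the fixed angle Im OMEGA (clockwise here).
turns = [cmath.phase(points[i + 1] / points[i]) for i in range(len(points) - 1)]
```

A constant modulus ratio together with a constant angular step is exactly the description of points on a logarithmic spiral winding in upon the origin.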
Together with a prescription of continuity at z=0, the formula 


— 1) — — 1) 
(T p41 —T,)z* 


defines F(z) as a function of z which is analytic over the finite z-plane. This
function is, therefore, in particular bounded in the unit circle, and M may
accordingly be chosen as a constant such that M ≥ 1, and


| F(z)| < M, for |z| < 1. 


F(z) = 


ptl ~ +p 




With any prescribed value arg ν_0, and with any positive constant δ that is
exceeded by both of the constants κ and π, let Σ_δ designate the sector


$|\arg \nu - \arg \nu_0| < \delta/2.$
With such a value of δ chosen, the relations


sin (8/2) ) 1/2 
M 


define in the λ-plane a set of circular regions with fixed radii, and centered at
the critical points (13.2). Through the relation (12.16) these regions are
mapped upon respective neighborhoods of the points ν^{(k,m)}. These will be referred
to briefly as critical neighborhoods.

From the relations (12.16) and (13.2), it may be drawn without difficulty 


that 


(13.6) k=p,p+2;m=m, 


— 
2°F(z), with z= (— — 
v 


It follows from this that every value λ within a region (13.6) corresponds to a
value of ν such that


6 
<= sin (=). with 6; < 6, 


(13.7) 


namely, that the critical neighborhood of the point ν^{(k,m)} is wholly within the
respective circle of the set (13.7). Since δ_1<κ, and δ_1<π, it is clear on the one
hand, because of the relation (13.5), that no two of the circles (13.7) have any
points in common, and on the other hand, directly from the formula (13.7),
that no one of them includes the point ν=0. Since each circle furthermore
subtends at ν=0 the angle δ_1, which is less than the angle of the sector Σ_δ,
the following facts are easily verified. If from the sector Σ_δ all points which
belong to any circle of the set (13.7) are deleted, the remainder of the sector
is a connected region within which there exist continuous paths of approach
to the vertex ν=0 along which |ν| steadily decreases. This is what was to be
shown. Since along such a path ν remains in the chosen sector, the oscillation
of arg ν does not exceed the prescribed value δ, and since ν enters no circle
(13.7), no root of the equation (12.16), either with k=p or with k=p+2,
enters into a region of the set (13.6). The roots of the equations (12.16) thus
remain uniformly bounded from the critical λ-values. Paths in the ν-plane
having the properties enumerated will be referred to henceforth as regular
paths for ν.
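The geometric fact used here, that a circle whose radius is |center|·sin(δ_1/2) subtends exactly the angle δ_1 at the origin and leaves the origin outside, can be confirmed by a short computation. This is a sketch only; the center and the value of `delta_1` are hypothetical numbers chosen for illustration.

```python
import cmath
import math


def subtended_angle(center, radius):
    """Angle subtended at the origin by the circle |nu - center| = radius."""
    return 2 * math.asin(radius / abs(center))


delta_1 = 0.3                          # hypothetical angle, less than pi
center = 0.05 * cmath.exp(1.1j)        # hypothetical critical value
radius = abs(center) * math.sin(delta_1 / 2)

angle = subtended_angle(center, radius)

# Brute-force check: spread of arg(nu) over the circle, measured from arg(center).
samples = 20000
offsets = [
    cmath.phase((center + radius * cmath.exp(2j * math.pi * k / samples)) / center)
    for k in range(samples)
]
measured = max(offsets) - min(offsets)
```

Since the subtended angle is less than the opening of the sector, no single circle can block the sector, which is why a path of steadily decreasing modulus can always pass around it.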

It may be noted incidentally that the cases in which the imaginary component
of the constant Ω/π is rational are peculiarly simple. From the formulas
(13.3) it may be seen that the critical points ν^{(k,m)} then all lie upon a
finite number of rays from the origin of the ν-plane. The circles (13.7) including
the critical neighborhoods are therefore centered upon these lines, and any
sector Σ_δ accordingly includes rectilinear paths of approach to ν=0 that are
regular.
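The finiteness of the set of rays may be made explicit as follows. This is a worked remark under the assumption that the critical values have the form ν^{(k,m)} = H_k e^{−mΩ}, and that the imaginary part of Ω/π equals p/q with integers p and q (q > 0); the symbols p and q here are introduced only for this illustration and are unrelated to the indices p, q of the text.

```latex
\[
  \arg \nu^{(k,m)} \equiv \arg H_k - m\,\operatorname{Im}\Omega
  = \arg H_k - \frac{m p \pi}{q} \pmod{2\pi},
\]
\[
  \arg \nu^{(k,m+2q)} \equiv \arg \nu^{(k,m)} - 2 p \pi
  \equiv \arg \nu^{(k,m)} \pmod{2\pi}.
\]
```

The argument thus depends only on the residue of m modulo 2q, so each spiral meets at most 2q rays, and the two spirals together at most 4q.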

It is familiar, and can easily be proved, that any exponential sum all of
whose zeros occur at points of the set (13.2) is uniformly bounded from zero if
λ is uniformly bounded from the points in question. Any quotient obtained
by the division of a function (13.1) by one of the exponentials which it involves
is such a sum, and is therefore bounded from zero when λ remains
outside of the circles (13.6). With any regular path for ν there may, therefore,
be associated a positive constant ρ which is such that for all values of ν
upon the path the relations


(13.8) | +4 pg | = p, 
k=?,~p+2;7 =0,1, 


are fulfilled by every root of an equation (12.16). 

14. The loci of the roots of an equation (12.16). As the parameter ν varies
along any regular path, the roots of the equation (12.16) with k=p, in the
respective half-plane S_{p−1,p}, remain distinct and trace out continuous loci in
the λ-plane. It is to be shown that there exists for each of these loci a finite
terminal point corresponding to the parameter value ν=0, and hence that
every root of the equation in question approaches a finite limit as ν→0.

The change of variable and parameter from λ and ν to z (= x+iy) and μ,
as given by the relations


= (i/2) [Tp — + log 


Ap—1 
Tp — 


(14.1) 


pe” = B,y exp { 


with μ ≥ 0, transforms the equation (12.16) into the equation
(14.2) 0 4 = y 
with 


r T 
(14.3) in 


= T p41 
If z_0 indicates the point corresponding to λ = 0, the region S_{p−1,p} is transformed
into the half-plane
(14.4)  $-\pi + \tan^{-1}(\beta/\alpha) \le \arg(z - z_0) \le \tan^{-1}(\beta/\alpha),$


and this includes all except possibly a finite segment of the positive axis of
reals.



From the equality of the pure imaginary components of its two members, 
and the equality of their absolute values, the complex equation (14.2) may be 
made to yield the pair of real cartesian equations 


(a) $e^{y}\sin\{[1+\alpha]x + \beta y + \theta\} = e^{-y}\sin\{[1-\alpha]x - \beta y - \theta\},$


(14.5) (b) 4e2#v{ cos? x + sinh? y} = prez, 


Since μ is proportional to |ν|, it may be taken to fill the role of the parameter.
Along any regular path, θ, which differs from arg ν by a constant, is then
determined as a function of μ, the oscillation of θ being less than δ for any
path in a sector Σ_δ. By virtue of the hypothesis 3, the real constants α, β,
which appear in the equations (14.5), and which are defined by the relation
(14.3), are such that


$-1 < \alpha < 1, \quad 0 < \beta.$


It may be noted, however, that in the equations (14.5) an interchange of
α and −α may be achieved by the substitution of −y and −θ in the place
of y and θ. Since any result derived for α>0 may, therefore, be translated to
apply when α<0, there is no essential loss of generality in assuming for the
explicit discussion that α≥0, and this will be done in the following.

With any choice of an initial parameter value ν_0, which is such that for the
associated value θ_0 the constant


+ (x/2)a 
(1 —a)x 


is not an integer, it is possible to associate an integer m_0 such that for all real
constants δ_0 which are numerically sufficiently small, the relations


| (qe + 1/2)ma + + S00 | 


are fulfilled when s=0 with q_0=0. If the case is one in which α>0, there exist
then a pair of positive increasing sequences of integers {q_s} and {m_s}, for
which the relations (14.6) are fulfilled when s=1, 2, 3, ⋯. It thereupon
follows further, again if |δ_0| is sufficiently small, that the relations


1/2 Oo +8 
(1+ 


are fulfilled by the integers of a third increasing sequence {p_s}. Let such
sequences relative to the chosen constant δ_0 be fixed upon. If the case is one
in which α=0, these sequences may be taken arbitrarily, since the relations
(14.6), (14.7) imply no specifications for them.

Consider the relations 




(a) y20, (1+ a)x+ By = (p. + 1/2)4 — 00 + 
(pe + — + do 
i+ea 
(ge + 1/2)4 + + do 


and 


(b) y=0, xbetween 


(14.8) 


1—ea 
(c) (1 — a)x — By = (q. + 1/2)9 + + do. 


In the (x, y)-plane the first and third of these define half-lines which terminate
upon the axis of x, and the second is the segment intercepted by them upon
this axis. The set of relations as a whole therefore defines a broken line which
divides the region (14.4) into two parts, in the one of which, to be denoted by
Z_s(δ_0), the abscissas are bounded above. It is clear that a region Z_s(δ_0) with a
larger index includes any one with a smaller index, and that the bound upon
the abscissas increases indefinitely with s. Let it be assumed now that the
path of ν lies in a sector Σ_δ for which δ fulfills the conditions imposed upon
|δ_0| above. It is to be shown for the equations (14.5) that every root (x, y)
which initially lies within any region Z_s(δ) remains within that region as μ→0,
and that any root which is initially outside of any region Z_s(−δ) remains
outside.

Consider any root in a position in which its ordinate is positive. For
this position the equation (14.5a) shows that the sine function in the left-hand
member of that equation is numerically less than unity, and hence that
the value of {[1+α]x+βy+θ} is not an odd multiple of π/2. With a suitable
determination of s, therefore,


(14.9) (1 + a)ax + By < (p. + 1/2)4 — 00 + 4, 


where p_s is a member of the sequence so designated through the relation
(14.7) in association with the value of δ_0=δ. The point (x, y), therefore, lies
in the region Z_s(δ), and since the relation (14.9) maintains while y>0, it is
clear that the root cannot issue from this region across the boundary (14.8a).
Similarly with s properly redetermined and p_s a member of the respective
sequence associated with the value δ_0=−δ through the relation (14.7) it is
assured that


(pe + — — 5 < (1 + a)x + By. 


The root (x, y) thus lies initially outside of the region Z_s(−δ) with this index
s, and since the reasoning employed above shows it to remain outside so long
as y>0, it is evident that no root may enter any such region across the portion
(14.8a) of its boundary.

If in any of its positions the ordinate of a root (x, y) is negative, the equation
(14.5a) shows that the value of {[1−α]x−βy−θ} is not an odd multiple
of π/2, and hence that




(gn + 1/2)" +0 — 8 < [1 — a]x — By < (qe + 1/2)9 + Oo + 8, 


provided s and s_1 are properly determined, and the sequences {q_s} and {q_{s_1}}
are associated with the values δ_0=δ and δ_0=−δ respectively, through the
relations (14.6). Since this configuration maintains so long as y<0, it follows
that no root may either issue from a region Z_s(δ) or enter into a region
Z_s(−δ) across a boundary (14.8c).

Finally, upon setting y=0 the equation (14.5a) is found to reduce to the
form
$\cos x \, \sin(\alpha x + \theta) = 0.$
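The reduction may be written out as follows, reading the equation (14.5a) in the form e^{y} sin{[1+α]x+βy+θ} = e^{−y} sin{[1−α]x−βy−θ} and applying the elementary identity for a difference of sines; the auxiliary symbols A and B are introduced only for this remark.

```latex
\[
  y = 0:\qquad \sin A - \sin B = 0, \qquad
  A = (1+\alpha)x + \theta, \quad B = (1-\alpha)x - \theta,
\]
\[
  \sin A - \sin B
  = 2 \cos\frac{A+B}{2}\,\sin\frac{A-B}{2}
  = 2 \cos x \,\sin(\alpha x + \theta).
\]
```

The zeros on the axis of x accordingly fall into the two families discussed next, those of cos x and those of sin(αx+θ).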


Of the roots of this equation those that are zeros of the factor cos x lie at
points of the set


(14.10)  $[(r + 1/2)\pi,\ 0]$, $r = 0, 1, 2, \cdots$.


They are shown by the equation (14.5b) to be uniquely associated with the
parameter value μ=0, and thus, as points of loci which are traced out as μ→0,
they are terminal points, and not points at which the loci actually cross the
axis of x. Such crossing points must accordingly be zeros of the factor
sin (αx+θ), and hence points at which the respective values of {αx+θ_0}
differ from integral multiples of π by less than the amount |δ_0|. If α>0 no
such point lies on any segment (14.8b), either for δ_0=δ or for δ_0=−δ, since
the relations (14.6) and (14.7) insure that on any such segment


new + | < ax + < (m+ —| do]. 


On the other hand if α=0 there exist no such points at all, as may be seen
from the relation (14.6) with s=0. Since a root may, therefore, neither issue
from a region Z_s(δ) nor enter into a region Z_s(−δ) over the boundary (14.8b),
the assertion above has been substantiated.

This deduction admits of two specific and pertinent conclusions. In the
first instance, since every root remains within some region Z_s(δ), its abscissa
is subject to some upper bound. By the equation (14.5b), therefore, it approaches
a limit as μ→0, and this limit is a point of the set (14.10). In the
second instance, since no root may enter into any region Z_s(−δ), it follows
that the distance of any root from the point z_0 is subject to a lower bound,
and that this bound is arbitrarily large for any root which is sufficiently distant
at any specific value of μ.

As reinterpreted into terms of the variables λ and ν through the formulas
(14.1), the results may be formulated thus. The roots λ′ of the equation
(12.16) with k=p which lie in the half-plane S_{p−1,p}, all approach finite limits
as ν→0, and these limits are all points of the set


(14.11) + + log 


T, = T p41 


Any such root λ′ which at any specific value of ν is sufficiently large in absolute
value, remains arbitrarily large as ν→0.

15. The characteristic values as ν→0. In §12 it was deduced that if there
exists for the given boundary problem an index q, such that A_l(λ, 0) ≠ 0 for
l = q−1, q, q+1, then the characteristic values which lie in the regions
S_{q−1}(N) and S_q(N) are represented by the points of the sets (12.10) with
l = q−1, q. Insofar as these deductions were concerned, the restriction (12.5)
upon the parameter ν was wholly dispensable, since the relation (12.4) insures
that the values (12.6) are arbitrarily small when N is suitably large, without
recourse to the restriction upon ν. The representations of the characteristic
values by the relations (12.13), (12.15) are, therefore, not only valid as stated
in §12, but maintain as ν→0.

The contrary is the case insofar as the characteristic values are concerned,
which lie in any region S_{p−1}(N) or S_p(N) with an index p for which A_p(λ, 0) = 0.
That these values cannot be represented for unrestricted values of ν by the
points (12.14) through the relation (12.13), is, in fact, immediately evident,
since these points recede to infinity as ν→0. For the deductions culminating
in the relations (12.13), (12.14) the restriction (12.5) was, therefore, essential,
and it accordingly remains to deduce for the characteristic values in any half-plane
S_{p−1,p} a representation which maintains as ν→0.

For λ in a region S_p(N), let the function φ(λ) be defined by the formula


(15.1) = — apis — 
With λ=λ′+Δλ, in which λ′ is any root of the equation (12.16), an alternative
formula for this function is
(A) = — b2(AA) } 


in which 
= + py } , 


1 — eT 


i- 


$3(AX) = 
p41 


If the parameter ν lies on a regular path, the relation (13.8) with k=p and
j=0, insures the existence of a positive constant ρ which is such that
|φ_1(λ′)| > ρ for all choices of the root λ′ and all positions of ν on the path.
The functions φ_2(Δλ) and φ_3(Δλ) are analytic for small values of Δλ, and
vanish at Δλ=0. A positive constant ε may be determined, therefore, such
that when |Δλ| = ε, then |φ_2(Δλ)| ≤ ρ/2, and |φ_3(Δλ)| ≠ 0. It is clear, then,
that on the circle


(15.2)  $\lambda = \lambda' + \Delta\lambda, \quad |\Delta\lambda| = \epsilon,$


the function φ(λ) fulfills a relation
(15.3)  $|\phi(\lambda)| > M,$


with some positive constant M. 
Let the function ψ(λ) be defined by the formula

(5.4) = {A, (A, ») — — {A ») — + 

On the circle (15.2) an alternative expression for ψ(λ) is


Ap+l 


WA) = A + 


Set) 


and from this it may be seen that
(15.5)  $|\psi(\lambda)| < M,$


provided N is sufficiently large and |ν| suitably small, since the exponential
exp {AC p1-T,) } is bounded in the region S,(N), and the functions 
exp { A p41} and 7,(A, v), are then arbitrarily small. 

By the deductions of §12, both the characteristic values and the roots of
the equation (12.16) in the region S_p(N), are represented with an arbitrary
degree of accuracy at any specific value of ν (≠0) by the set of points (12.14)
with k=p, j=0, if the value of N is sufficiently large. Each characteristic
value thus corresponds to and is represented in an obvious sense by the
respective root λ′. Since with the definitions (15.1) and (15.4) the equation
(8.2) is the characteristic equation, it follows from the relations (15.3) and
(15.5) by reasoning which is now familiar, that the circle (15.2) contains and
retains a characteristic value within it, and therefore that λ′ continues to
represent its associated characteristic value, so long as it remains in the
region S_p(N).

By the formal interchange of the symbols a@p-1, and with apy, 
Boy, and I'p4; respectively, the deductions given above may be adapted to 
the consideration of the characteristic values and roots λ′ in the region
S_{p−1}(N). Since it was found in §14, that any root λ′ which at an initial value
of ν lies in the domain comprised of the regions S_{p−1}(N) and S_p(N) remains
in this domain as ν→0, the asymptotic representability of the characteristic
values in the half-plane S_{p−1,p} by means of the roots of the equation (12.16)
as ν→0, has been established. In particular, therefore, every characteristic
value λ_{l,m} approaches a finite limit as ν→0, and |λ_{l,m}| is subject to a lower
bound, which is indefinitely large with the index m and is independent of the
parameter ν.


Since as ν→0 along a regular path each circle (15.2) contains precisely
one characteristic value, it is a particular consequence that all such values
lying outside of some circle centered at λ=0 are simple, and that multiplicity
is accordingly possible at most in the instance of members of the finite set
which lies within such a circle. The following consideration shows, therefore,
that multiplicities of the characteristic values may be wholly obviated for
values of ν different from zero by an appropriate choice of the path of ν.

Within any circle about the origin, the determinant D(λ, ν) given by the
formula (5.15), and D_λ(λ, ν), its partial derivative as to λ, are analytic functions
of λ and polynomials in ν. Their ν eliminant, therefore, has at most a
finite number of zeros within the circle, and these zeros correspond through
the characteristic equation (5.7) to the values of ν for which a multiple characteristic
value is possible. Inasmuch as these values of ν are thus also finite
in number they may, except for ν=0, if that is among them, be avoided
by the choice of the path of ν. It will be assumed in the following that any
path of ν that is brought into question does avoid these points. The relation


(15.6)  $D_\lambda(\lambda, \nu) \ne 0$, for $\nu \ne 0$,


is then fulfilled by every characteristic value.


CHAPTER 5 


SEQUENCES OF CONTOURS IN THE λ-PLANE


16. An ordering of the characteristic values. Through their designation
in the manner λ_{l,m} the characteristic values have been grouped into sub-sets
which are distinguished by the respective index values l = 1, 2, 3, 4. For the
continuing discussion advantages no longer subsist in this, and these values
may profitably be regarded henceforth as members of a single simple sequence,
in which the ordering is specifically such as will be described in the
following.
Let δ be chosen as any positive constant which fulfills the relations


16.1 6 


and let δ_1 thereupon designate the smaller one of the values
$(\delta/3)\,|\Gamma_j|$, $j = 1, 2$.


For those indices l for which A_l(λ, 0)=0 the functions c_l(ν), given by the
relations (12.2), are constant multiples of ν, and hence if ν_0 is an initial
parameter value (different from zero) on any regular path that lies in a sector
Σ_{δ_1}, the relation


(16.2)  $|\arg c_l(\nu) - \arg c_l(\nu_0)| < \delta_1,$


is fulfilled along that path. For those indices l for which A_l(λ, 0) ≠ 0, on the
other hand, the functions c_l(ν) approach non-vanishing limits as ν→0. There
exists, therefore, a neighborhood of the origin in which the oscillations of the
respective functions arg c_l(ν) remain less than δ_1, and hence if ν_0 is chosen in
such a neighborhood the relation (16.2) is fulfilled for all indices l when ν is
on the path segment terminated by ν_0 and the origin. It will be supposed in
the following that ν_0 is so chosen, and that ν varies on such a path segment.
The relations
(16.3) arg 

T, ci(v) 
then maintain for all indices h and l.

Let the characteristic values be ordered now into a simple sequence 


(16.4) r=1,2,3,---, 


with an ordering such that at ν_0 their absolute values stand in a non-decreasing
succession, that is,


(16.5)  $|\lambda_r(\nu_0)| \le |\lambda_{r+1}(\nu_0)|$, $r = 1, 2, 3, \cdots$.


Through the asymptotic relationship (12.12), which maintains at ν=ν_0, this
ordering evidently serves immediately to order also the corresponding points
(12.10), at least insofar as those with sufficiently large indices m are concerned,
into the sequence


(16 .6) (r), r=,m+i1,n+2,---. 


Inasmuch as each member of this latter sequence is drawn from one of the 
four sub-sets (12.10), it is clear that any consecutive five of them must include 
at least two from some one of the sub-sets. To every sufficiently large index r, 
therefore, there corresponds some index pair (l, m) such that


(16.7) | | — | AP(v0) | | | — | | 


In this relation the right-hand member differs by arbitrarily little from the
value 2π/|Γ_l|, whenever m is large enough, as may be seen from the formula
(12.10). It follows in particular, from the relation (16.1), that the left-hand
member of the inequality (16.7) exceeds the value 32δ whenever r exceeds
some specifiable value, and hence that for every such index r at least one of the

differences 


| ArS-1+(¥0) | — | Ars-s(vo) |, j = 0, 1, 2, 3, 


exceeds 8δ. It may be asserted, therefore, that there exists an increasing sequence
of integers n of which no one exceeds its predecessor by more than
four, and for each of which the relation


(16.8)  $|\lambda^{*}_{n+1}(\nu_0)| - |\lambda^{*}_{n}(\nu_0)| > 8\delta,$


is valid.
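The counting argument behind this assertion can be tried on model data. The sketch below is illustrative only: it merges four arithmetic progressions standing in for the moduli of the four sub-sets (12.10), with hypothetical spacings and offsets, and takes δ so that 32δ lies below the smallest spacing, which is assumed here to be the content of the relation (16.1).

```python
def merged_points(spacings, offsets, count):
    """Merge several arithmetic progressions into one non-decreasing list."""
    points = []
    for spacing, offset in zip(spacings, offsets):
        points.extend(offset + spacing * m for m in range(count))
    return sorted(points)


SPACINGS = [1.0, 1.3, 1.7, 2.1]    # hypothetical stand-ins for 2*pi/|Gamma_l|
OFFSETS = [0.11, 0.47, 0.73, 0.29]  # hypothetical starting moduli
COUNT = 40
DELTA = min(SPACINGS) / 33.0        # guarantees 32*DELTA < every spacing

# Keep only the range in which all four progressions are fully represented.
cutoff = min(o + s * (COUNT - 1) for s, o in zip(SPACINGS, OFFSETS))
moduli = [v for v in merged_points(SPACINGS, OFFSETS, COUNT) if v <= cutoff]

gaps = [b - a for a, b in zip(moduli, moduli[1:])]

# Indices n at which the gap to the next point exceeds 8*DELTA.
wide = [i for i, g in enumerate(gaps) if g > 8 * DELTA]
```

Any five consecutive points include two from one progression, so they span more than 32δ, and one of the four gaps between them exceeds 8δ; consecutive entries of `wide` therefore never differ by more than four.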

17. The sequence of contours C_n. For each index n for which the relation
(16.8) is fulfilled, let the circle in the λ-plane which is centered at the origin
and of the radius Λ_n, with


(17.1)  $\Lambda_n = (1/2)\{\, |\lambda^{*}_{n}(\nu_0)| + |\lambda^{*}_{n+1}(\nu_0)| \,\},$
be designated as the contour C_n. It will be seen at once that
(17.2)  $|\lambda^{*}_{r}(\nu_0)| < \Lambda_n - 4\delta$, for $r \le n$,
        $|\lambda^{*}_{r}(\nu_0)| > \Lambda_n + 4\delta$, for $r > n$,


and hence that at ν=ν_0 no point of the sequence (16.6) lies within a distance
of 4δ from any one of the contours C_n. It is to be shown that a succession of
points ν_n may be chosen on the path of ν, such that |ν_n| decreases monotonically
to zero, and such that no characteristic value lies within a distance of δ from
the contour C_n when ν is between ν_0 and ν_n, namely when ν is on the “path
segment” (ν_0, ν_n).

If x, y, x_0, and y_0 are any real values, and z = x+iy, z_0 = x_0+iy_0, the relation


$|z|^2 - |z_0|^2 = (x^2 - x_0^2) + (y - y_0)(y + y_0),$
is an obvious one, which leads easily to the inequality


(17.3)  $\bigl|\,|z| - |z_0|\,\bigr| \le \dfrac{|x^2 - x_0^2|}{|z| + |z_0|} + |y - y_0|.$
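The passage from the identity to the inequality (17.3) may be spelled out as follows, on the reading that (17.3) bounds the difference of moduli by |x² − x_0²|/(|z| + |z_0|) + |y − y_0|, and assuming that z and z_0 do not both vanish. It uses only the factorization of a difference of squares and the bound |y + y_0| ≤ |z| + |z_0|.

```latex
\[
  \bigl(|z| - |z_0|\bigr)\bigl(|z| + |z_0|\bigr)
  = |z|^2 - |z_0|^2
  = (x^2 - x_0^2) + (y - y_0)(y + y_0),
\]
\[
  \bigl|\,|z| - |z_0|\,\bigr|
  \le \frac{|x^2 - x_0^2|}{|z| + |z_0|}
    + |y - y_0|\,\frac{|y + y_0|}{|z| + |z_0|}
  \le \frac{|x^2 - x_0^2|}{|z| + |z_0|} + |y - y_0| .
\]
```

The last step holds because |y| ≤ |z| and |y_0| ≤ |z_0|.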
Let any member of the set (16.6) be chosen, and let the indices (l, m) be determined
so that this member is also given by the formula (12.10). Then with
ci(v) 
C141(¥) 
1 { ci(v) 
y = + ar 


and x_0=x(ν_0), y_0=y(ν_0), the formula (12.10) yields |z| = |λ*_r(ν)|,
|z_0| = |λ*_r(ν_0)|.


At v=yo, the relations 


1 
log 
T 
(17.4) | 


1 ci(v) 
(17.5) log < (6/3)An, i= 1, 2, 3, 4, 


are all fulfilled for every sufficiently large index n. They evidently continue to
be fulfilled as ν varies from ν_0, so long as it remains subject to a condition


1/2 


(17 .6) log S aA, , 


v 
in which a is an appropriately determined positive constant. With each index
n concerned there may, therefore, be associated a point ν_n on the path of ν
for which the equality in the relation (17.6) applies. The condition (17.5) is
then clearly fulfilled when ν is between ν_0 and ν_n, whereas the sequence |ν_n|
converges monotonically to zero, as was asserted above. With ν on the path
segment (ν_0, ν_n), the formulas (17.5) and (16.3) show at once that


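$|x^2 - x_0^2| < (\delta/3)\Lambda_n,$


and
$|y - y_0| < (2\delta/3).$
The relation (17.3) accordingly yields the inequality
(17.7)  $\bigl|\,|\lambda^{*}_{r}(\nu)| - |\lambda^{*}_{r}(\nu_0)|\,\bigr| < \dfrac{(1/3)\delta\Lambda_n}{|\lambda^{*}_{r}(\nu)| + |\lambda^{*}_{r}(\nu_0)|} + 2\delta/3,$
and from this, together with the relations (17.2), it may be concluded that


(17.8)  $|\lambda^{*}_{r}(\nu)| < \Lambda_n - 2\delta$, for $r \le n$,
        $|\lambda^{*}_{r}(\nu)| > \Lambda_n + 2\delta$, for $r > n$.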


By the deductions of §12 the relations (12.11), with any positive ε, are
fulfilled for all sufficiently large values of N, and maintain while ν fulfills
the respective condition (12.5). Since in these deductions the role of ε may
be taken by the constant δ above, and since the corresponding role of N is
then filled by any of the constants Λ_n−2δ in which n is sufficiently large, it
follows that all characteristic values λ_r(ν) which lie outside of the circle
|A| =A, fulfill a relation 


(17.9)  $|\lambda_r(\nu) - \lambda^{*}_{r}(\nu)| < \delta,$
and do so for all values of ν that satisfy the condition
Inasmuch as the relation 
ai(A, — 28) > 


is fulfilled for all sufficiently large indices n, this specification upon ν is implied
by the condition (17.6). The relation (17.9) thus applies in particular over the
path segment (ν_0, ν_n), and from this, together with the inequalities (17.8), it
follows at once that on this path segment the relations


(17.10)  $|\lambda_r(\nu)| < \Lambda_n - \delta$, for $r \le n$,
         $|\lambda_r(\nu)| > \Lambda_n + \delta$, for $r > n$,


maintain.
No characteristic value comes within a distance of δ of any contour C_n
associated with a sufficiently large index n, therefore, while ν varies along its
path from ν_0 to ν_n.

18. The determinant D(λ, ν) on the contour C_n. As it is expressed through
the formula (5.15), the determinant D(λ, ν) is the sum of four terms with
coefficients A_l(λ, ν) that are of the structure (12.2), (12.3). It is to be shown
that when, with a sufficiently large index n, λ and ν are respectively on the
contour C_n and the path segment (ν_0, ν_n), the functions B_l(λ, ν) defined by
the formulas


(18.1) B,(A, v) = {(— 1) (a } 


1 
D(a, v) 


are bounded uniformly as to m, namely that there exists some constant M, 
independent of m and /, such that for all indices / 


(18.2) | Bi(A, v)| < M, ford on C,, and on (v0, 
When =A,, with a sufficiently large index and is on (v9, v,), the values 
(18.3) | cx(v)er¥? |, 1 = 1, 2, 3, 4, 


all differ from zero. As X traces the circle C,, each of these values is in its 
turn the dominant one upon a respective arc of the circle. If this arc upon 
which the largest of the values (18.3) is that given by the index h, is denoted 
by C®, the relations 


evi 
(18.4) <1, for », on and (vo, 


cr(v) 
are fulfilled for all 7, and it is accordingly clear that on this arc the index h 
also marks the dominant one of the functions (18.1). The relations (18.2) will, 
therefore, be established if it is shown that there exists a constant M such 
that for every h 
(18.5) | B,(A, v) | < M, for v on (vo, »,) and A on 


From the relations (12.1), the arc C® is seen to lie partly in each of the 
sectors S, and S,-.. It consists, therefore, of two contiguous arcs which may 
be conveniently denoted by co, j=0, 1, and which lie in the respective 
regions S,;. On each of these arcs the inequality (18.5) may be established 
in the manner of the following. The formulas (18.1), (5.15) and (12.3) yield 
for the reciprocal of B,(A, v) the expression 




B,(a, ») Ch Ch 
(18.6) 
| 


Ch 


By the use of the relations (11.7) the final two terms in this may be written 
in the form 


A { { ATA A { 
e +1 _ — 1 


On the arc C®”, therefore, their sum is arbitrarily small, in virtue of the 
relation (18.4), and the fact that with A in the domain S;,(A,) and v on (v9, v_) 
the values (12.6) are arbitrarily small. The remaining terms on the right of 
the relation (18.6) are expressible in the manner 


C ati 
{1 on} + ~ 


Ch 
In this the first member is identical with the function 
(18.7) { 1 _ 


because of the formula (12.10), whereas the remaining member is again arbi- 
trarily small in virtue of the relation (18.4) and the fact that the functions 
x:(A, v) approach zero uniformly as |A| + . Since with vy on the path segment 
(vo, vx) and on the arc C®®, the value (A—As,)I's is bounded from the 
multiples of 27i, uniformly as to m, as was shown in §17, it follows that the 
function (18.7), and hence the entire right-hand member of the formula 
(18.6), is uniformly bounded from zero. Thus with a suitable constant M, the 
relation (18.5) is established insofar as the values of \ on the arcs C®® are 
concerned. 

For the discussion relative to the arcs C*” the reasoning above may be 
essentially adapted by the mere interchange of the roles of the third and fifth 
terms on the right of the formula (18.6). Thus the sum of the third and fourth 
terms, when written in the form 


A { on} +A { { ~ ant 
is seen to be arbitrarily small, since that is true of the expressions (12.6) with 


l=h—1. The remaining terms on the right of the formula (18.6) are expressi- 
ble in the form 


"1 
crerva 


{1 — + {x — 




and this is uniformly bounded from zero, as was the value (18.7). The exist- 
ence of a constant M for which the inequalities (18.5), and hence (18.2), are 
valid, may thus be regarded as established. 


CHAPTER 6 


ADJOINT BOUNDARY PROBLEMS 
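19. The definition and the solutions of adjoint boundary problems. In
terms of the square matrices which appear as coefficients in the equations
(5.2), the set of relations
$$\begin{aligned}
\mathfrak{z}'(x,\nu) &= -\mathfrak{z}(x,\nu)\{\lambda\mathfrak{R}(x)+\mathfrak{Q}(x)\},\\
\mathfrak{z}(0,\nu) &= \mathfrak{b}(\nu)\mathfrak{B}^{(0)}(\lambda,\nu),\\
\mathfrak{z}(1,\nu) &= -\mathfrak{b}(\nu)\mathfrak{B}^{(1)}(\lambda,\nu),
\end{aligned}\tag{19.1}$$
may be looked upon as constituting a differential system for a pair of vectors
$\mathfrak{b}(\nu)$ and $\mathfrak{z}(x,\nu)$, of the form
$$\begin{aligned}
\mathfrak{b}(\nu) &= (b_1(\nu),\, b_2(\nu)),\\
\mathfrak{z}(x,\nu) &= (z_1(x,\nu),\, z_2(x,\nu)).
\end{aligned}\tag{19.2}$$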


With $\lambda$ and $\nu$ at any specific values, this system will be characterized as the
adjoint of the respective differential system (5.2)(*). As in the case of this
latter, $\lambda$ and $\nu$ are to be considered as complex scalar parameters. The vector
$\mathfrak{b}(\nu)$ will be referred to as the parametric vector, and of a pair of vectors (19.2)
which together satisfy the equations of the system, the vector $\mathfrak{z}(x,\nu)$ will be
called the solution(?*).

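Let $\mathfrak{Y}(x,\lambda)$, as heretofore, be any nonsingular solution of the matrix
equation (4.1). The general solution of the differential equation of the system
(19.1) is then given by the formula
$$\mathfrak{z}(x,\nu) = \mathfrak{f}(\nu)\mathfrak{Y}^{-1}(x,\lambda), \tag{19.3}$$
in which $\mathfrak{f}(\nu)$ is an arbitrary vector independent of $x$. Upon substituting this
form into the boundary relations of the system, the vector $\mathfrak{f}(\nu)$ is found to be
subject to the evaluations
$$\begin{aligned}
\mathfrak{f}(\nu) &= \mathfrak{b}(\nu)\mathfrak{B}^{(0)}(\lambda,\nu)\mathfrak{Y}(0,\lambda),\\
\mathfrak{f}(\nu) &= -\mathfrak{b}(\nu)\mathfrak{B}^{(1)}(\lambda,\nu)\mathfrak{Y}(1,\lambda).
\end{aligned}\tag{19.4}$$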

(*®) The comparative structure of the systems is somewhat better shown if the equations 
(5.2) are written in the form y(x, ») = {AR(x)+OQ(x) }y(x, v), BA, v)y(0, »)=—al), 
BA, »)v(1, ») = —a(r). 

(°) This formulation of the adjoint differential system differs in some relatively minor 
details from that given by the author in the paper: The boundary problem of an ordinary linear 
differential system in the complex domain, Trans. Amer. Math. Soc. vol. 46 (1939) p. 165. It is 
obtainable therefrom, however, by setting m=2, no=m=0, m=1, and 3° (x)= —3)(x) 


= 3(x, »). 


(19.4) 




The consistency of these evaluations, as may be seen by the elimination of
$\mathfrak{f}(\nu)$, is contingent upon fulfillment of the condition
$$\mathfrak{b}(\nu)\mathfrak{D}(\lambda,\nu) = \mathfrak{o}, \tag{19.5}$$
in which $\mathfrak{D}(\lambda,\nu)$ is precisely the matrix given by the formula (5.6). A solution
of the system may, therefore, exist only in association with a parametric
vector which satisfies the condition (19.5). Conversely, it is seen at once,
every parametric vector which does satisfy this condition has a solution
associated with it through the relations (19.4) and (19.3).
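The elimination of $\mathfrak{f}(\nu)$ referred to above may be written out as follows. This is only a sketch: it assumes that the formula (5.6), which is not reproduced in this section, defines $\mathfrak{D}(\lambda,\nu)$ as the sum $\mathfrak{B}^{(0)}(\lambda,\nu)\mathfrak{Y}(0,\lambda)+\mathfrak{B}^{(1)}(\lambda,\nu)\mathfrak{Y}(1,\lambda)$, an assumption that is consistent with the use made of $\mathfrak{D}$ in §§20 and 22.

```latex
\begin{aligned}
\mathfrak{f}(\nu)
  &= \mathfrak{b}(\nu)\mathfrak{B}^{(0)}(\lambda,\nu)\mathfrak{Y}(0,\lambda)
   = -\mathfrak{b}(\nu)\mathfrak{B}^{(1)}(\lambda,\nu)\mathfrak{Y}(1,\lambda) \\
\Longrightarrow\quad
\mathfrak{o}
  &= \mathfrak{b}(\nu)\bigl\{\mathfrak{B}^{(0)}(\lambda,\nu)\mathfrak{Y}(0,\lambda)
     + \mathfrak{B}^{(1)}(\lambda,\nu)\mathfrak{Y}(1,\lambda)\bigr\}
   = \mathfrak{b}(\nu)\mathfrak{D}(\lambda,\nu).
\end{aligned}
```

Read in the reverse direction, any $\mathfrak{b}(\nu)$ annihilated on the right by $\mathfrak{D}(\lambda,\nu)$ makes the two evaluations of $\mathfrak{f}(\nu)$ agree, which is the converse statement of the text.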

The choice $\mathfrak{b}(\nu) = \mathfrak{o}$ obviously satisfies the equation (19.5). It is, however,
uniquely associated with the solution $\mathfrak{z}(x,\nu) = \mathfrak{o}$. This solution, which is thus
always available, may properly be regarded as trivial, and to bar it from the
further considerations the specification
$$\mathfrak{b}(\nu) \neq \mathfrak{o}, \tag{19.6}$$
will be imposed. Under this condition the possibility of fulfilling the relation
(19.5), and hence the existence of a solution $\mathfrak{z}(x,\nu)$, is contingent upon the
values of $\lambda$ and $\nu$, and the differential system (19.1) may accordingly be
regarded as defining a family of boundary problems, precisely as such a family
is defined by the system (5.2). The two families (19.1) and (5.2) will be
defined to be adjoint.

Under the restriction (19.6), the equation (19.5) is solvable if and only if
$\lambda$ is a value for which the matrix $\mathfrak{D}(\lambda,\nu)$ is singular. Such values of $\lambda$ are
accordingly to be designated as characteristic values of the boundary problem
(19.1). Since, as roots of the equation (5.7), they have already been identified
as characteristic values of the boundary problem (5.2), it must be concluded
that adjoint boundary problems have the same characteristic values. That
every such value is of the same index, namely admits the same number of
linearly independent solutions, for each of the two boundary problems follows
also. For the numbers of linearly independent vectors $\mathfrak{c}^{(r)}(\nu)$ and $\mathfrak{b}^{(r)}(\nu)$
which satisfy the respective equations
$$\begin{aligned}
\mathfrak{D}(\lambda_r,\nu)\mathfrak{c}^{(r)}(\nu) &= \mathfrak{o},\\
\mathfrak{b}^{(r)}(\nu)\mathfrak{D}(\lambda_r,\nu) &= \mathfrak{o},
\end{aligned}\tag{19.7}$$
at the characteristic value $\lambda_r$, are, of course, either both one or both two,
according as the rank of the matrix $\mathfrak{D}(\lambda_r,\nu)$ is one or zero. If this rank is zero,
it is clear that $\lambda_r$ must be of multiplicity at least two as a zero of the
determinant $D(\lambda,\nu)$. The multiplicity of a characteristic value is, therefore,
never exceeded by its index.

The solutions which are associated with any vectors $\mathfrak{c}^{(r)}(\nu)$ and $\mathfrak{b}^{(r)}(\nu)$
fulfilling the relations (19.7), are given, respectively, by the formula
$$\mathfrak{y}^{(r)}(x,\nu) = \mathfrak{Y}(x,\lambda_r)\mathfrak{c}^{(r)}(\nu), \tag{19.8}$$
and by either one of the equivalent formulas
$$\begin{aligned}
\mathfrak{z}^{(r)}(x,\nu) &= \mathfrak{b}^{(r)}(\nu)\mathfrak{B}^{(0)}(\lambda_r,\nu)\mathfrak{Y}(0,\lambda_r)\mathfrak{Y}^{-1}(x,\lambda_r),\\
\mathfrak{z}^{(r)}(x,\nu) &= -\mathfrak{b}^{(r)}(\nu)\mathfrak{B}^{(1)}(\lambda_r,\nu)\mathfrak{Y}(1,\lambda_r)\mathfrak{Y}^{-1}(x,\lambda_r).
\end{aligned}\tag{19.9}$$
For the purposes of subsequent deductions certain pairings of these solutions
are advantageous, and are definable whenever the characteristic value
concerned is of an index equal to its multiplicity.
Let $d_{ij}$ designate the elements of the matrix $\mathfrak{D}$, so that
$$\mathfrak{D}(\lambda,\nu) = (d_{ij}(\lambda,\nu)). \tag{19.10}$$
If \, is a characteristic value of the index and multiplicity one, the relations 
v) 0, Dy(rry v) 0, 


maintain if the subscripts (k, /) are suitably chosen. The relations (19.7) are 
then in particular fulfilled by the vectors 


= ( ) 
¥) \— dar(de, ») 


(19.9) 


(19.11) 


= (— dai(Ar, v), ¥)), 


and neither of these is the zero vector. With the evaluations (19.11), the solu- 
tions (19.8), (19.9) will be said to be a normal pair. If X, is a characteristic 
value of index and multiplicity two, it may conventionally be regarded as the 
pair of coincident values A, and A,41. Since in this case the value 


DAA, ») 
(A — 


is not zero, whereas it is the determinant of the matrix 


19.12) ) = (tim 
(19. = (lim), 


this matrix is nonsingular. It may then be verified that the determinations 


1 
= = (— 1, 0), 
(19.13) 


0 


fulfill the relations (19.7), respectively for r=s and r=s+1. With each of 
them the respective solutions (19.8), (19.9) will also be said to be a normal 
pair. No normal pairing of solutions will be defined in the instance of charac- 
teristic values whose multiplicities and indices are not equal. 




20. The generalized relation of bi-orthogonality. Under the normalization 
of the given boundary problem in §3, and the construction of its imbedding 
family in §5, each element v,(A, v) of the matrices B(A, v), BMA, v) isa 
polynomial in \ of maximum degree 7;, and 7; =. It is readily seen in virtue 
of this, that the equations 


(20.1) BM A’, v) = BMA”, — (A” — BOP’, »), 
l=0 
implicitly define the matrices which are therein designated by »), 
and that any element v%"(X’, v) of such a matrix is a polynomial in \’ of 
degree at most r;—/—1 if 7; —1—120, and vanishes identically if 7; —/—1 <0. 
The relation 


{3(2)0'(x) + dx = — 


is an evident identity. If the vector 3 and the matrix 9) involved in it are 
taken respectively to be any solution 3?(x, v) and the matrix (x, A) which 
appears in the formula (19.8), the equations (19.1) and (4.1) may be used to 
give the resulting equality the form 


Ap) f V)R(x)Y(x, A)dx 
0 
+ {BW »)Y(0, + B(A,, »)Y(1, )} = 0. 


In this expression the matrices B“(A,, v) may be replaced by their equiva- 
lents as given by the formulas (20.1) with \’=A, and A” =X. The subsequent 
multiplication on the right by any one of the vectors (A—A,)~'e (v), it being 
assumed that \#A,, results then in giving the relation the form 


(20.2) 


fe, V)R(x)Y(x, A)e(v)dx 
0 


-1 
As $\lambda \to \lambda_q$, it follows from the formula (19.8) and the analyticity of the matrix
$\mathfrak{Y}(x,\lambda)$ as to $\lambda$, that
$$\lim_{\lambda\to\lambda_q} \mathfrak{Y}(x,\lambda)\mathfrak{c}^{(q)}(\nu) = \mathfrak{y}^{(q)}(x,\nu).$$
It follows similarly from the first of the relations (19.7) that
$$\lim_{\lambda\to\lambda_q} \mathfrak{D}(\lambda,\nu)\mathfrak{c}^{(q)}(\nu) = \mathfrak{o}.$$
If $\lambda_p \neq \lambda_q$, therefore, each member of the relation (20.3) approaches a limit as
$\lambda \to \lambda_q$, and the limiting form of the relation as it is given below under (20.5)(**)
may be regarded as established whenever the solutions $\mathfrak{z}^{(p)}(x,\nu)$ and $\mathfrak{y}^{(q)}(x,\nu)$
are associated with distinct characteristic values. It is to be shown that the
relation (20.5) is valid also when $\lambda_p = \lambda_q$, provided the solutions involved are
each a member of a normal pair.

When $\lambda_p = \lambda_q$, the limit of the right-hand member of the relation (20.3) is
$$-\mathfrak{b}^{(p)}(\nu)\mathfrak{D}_\lambda(\lambda_q,\nu)\mathfrak{c}^{(q)}(\nu). \tag{20.4}$$
If $\mathfrak{z}^{(p)}(x,\nu)$ and $\mathfrak{y}^{(q)}(x,\nu)$ are not members of the same normal pair, the
characteristic value in question is of the index two, and by the convention adopted
in §19, $p \neq q$. The vectors $\mathfrak{b}^{(p)}(\nu)$ and $\mathfrak{c}^{(q)}(\nu)$ are in this case evaluated by the
formulas (19.13), with $(p, q)$ identified either with $(s, s+1)$ or with $(s+1, s)$.
Under either alternative it is found directly that the limit (20.4) is 0, and
hence that the relation (20.5) below again maintains.

When $p = q$ the limit (20.4) is easily found to be 1 if the characteristic
value is of the index two. The vectors $\mathfrak{b}^{(p)}(\nu)$ and $\mathfrak{c}^{(q)}(\nu)$ are then as given by
the formulas (19.13) either with $p = q = s$, or with $p = q = s+1$, and the result
is immediate. If the characteristic value is of the index one, the expression
(20.4) for the limit is conveniently replaced by


»)D(Ap, 
») 


which is its equivalent, since $D(\lambda_p,\nu) = 0$. The vectors $\mathfrak{b}^{(p)}(\nu)$ and $\mathfrak{c}^{(q)}(\nu)$ are
in this instance evaluated by the formulas (19.11), and with these values the
limit in question is found, as has been stated, to be 1.

The solutions of adjoint boundary problems which are members of normal 
pairs thus fulfill the relations 


— Dry, — 


0 


(20.5) 


It will be observed that in the absence of the indicated sums in their left-hand 
members, these relations reduce to the expression of a familiar property of 
weighted bi-orthogonality of the solutions involved. This reduction evidently 
maintains whenever $\tau_1 = 0$, namely whenever the boundary problem given is


(2) In which $\delta_{pq} = 0$ if $p \neq q$,




one in which the boundary relations are independent of the parameter $\lambda$. The
relations (20.5) may, therefore, be looked upon as generally expressive of a 
property of the solutions of which bi-orthogonality is a specialization. 


CHAPTER 7 
EXPANSIONS IN SERIES OF CHARACTERISTIC SOLUTIONS 


21. The formal expansions of arbitrary vectors. When the parameter $\nu$
is on a regular path, and $\nu \neq 0$, all characteristic values, as has been seen,
satisfy the relation (15.6), and are therefore simple roots of the characteristic
equation. Every such value is, therefore, of the same index as multiplicity,
and the characteristic solutions of the adjoint boundary problems accordingly
have the property that they may be adjusted to appear without exception
as members of normal pairs. It is essential for the continuing discussion
that this property be invariably present, namely also at $\nu = 0$. Since the
boundary problem is then as originally given, the inherence of the property
in it must be a matter of assumption, and this it will be made by the following:


HYPOTHESIS 4. The given boundary problem is one for which every characteristic
value is of an index equal to its multiplicity.


On the basis of this hypothesis it may, and will be, understood in the
following, that the designations $\mathfrak{z}^{(p)}(x,\nu)$, $\mathfrak{y}^{(p)}(x,\nu)$, are reserved to solutions
of normal pairs.
If with any sequence of scalar coefficients $a_p(\nu)$, $p = 1, 2, 3, \cdots$, the series
of characteristic solutions in the equation
$$\sum_{p=1}^{\infty} a_p(\nu)\,\mathfrak{y}^{(p)}(x,\nu) = \mathfrak{f}(x,\nu), \tag{21.1}$$
is convergent uniformly on the interval $0 < x < 1$, and defines there the vector
$\mathfrak{f}(x,\nu)$ as shown, and if over and above that the related series in the equations


(»), 


p=1 


(21.2) 


v) 0), l= 1, 2, 3, 1), 
p=1 

also converge and define the indicated vectors f(y), the coefficients in 

question necessarily fulfill in turn the relations 


0 
(21.3) 


p = 1, 2, 3, 




in which the abbreviations 
(21.4) f(r) = F(0, = f(1, »), 


have been used. This may be established, simply by substituting for the 
respective vectors f(x, v) and f(y) in the formulas (21.3), their equivalent 
series (21.1), (21.2), interchanging the orders of integration and summation, 
and applying then the relations (20.5). 

With the formulas (21.3) thus at hand, the stated conditional basis, upon
which their relation with the equations (21.1) and (21.2) has been made
evident, may be abandoned. If with an arbitrary vector $\mathfrak{f}(x,\nu)$, whose
components are integrable as to $x$, an arbitrary auxiliary set of vectors $\mathfrak{f}^{(h,l)}(\nu)$ is
taken to be associated, the formulas (21.3) relate to these vectors a sequence
of scalars $a_p(\nu)$ as indicated. With these scalars as coefficients, the series of
characteristic solutions
$$\sum_{p=1}^{\infty} a_p(\nu)\,\mathfrak{y}^{(p)}(x,\nu), \tag{21.5}$$
is formally determined, and will be referred to in short as an expansion of the
vector $\mathfrak{f}(x,\nu)$. Inasmuch as this definition of an expansion is wholly formal, the
question of the convergence of such an expansion must manifestly be regarded
as an open one. More generally, the amenability of any given expansion to
evaluation by "means of summability" of any specific type would be a matter
calling for investigation, as would also all questions hinging upon the relation
which any value thus conventionally assigned to an expansion may bear to
the original generating vector $\mathfrak{f}(x,\nu)$.

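22. The expansions as series of residues. In terms of any analytic
nonsingular solution of the matrix equation (4.1), and the corresponding matrix
$\mathfrak{D}(\lambda,\nu)$ given by the relation (5.6), let $\mathfrak{G}(x,\xi,\lambda,\nu)$, which is to be known as
the "Green's" matrix, be defined by the formulas
$$\begin{aligned}
\mathfrak{G}(x,\xi,\lambda,\nu) &= \mathfrak{Y}(x,\lambda)\mathfrak{D}^{-1}(\lambda,\nu)\mathfrak{B}^{(0)}(\lambda,\nu)\mathfrak{Y}(0,\lambda)\mathfrak{Y}^{-1}(\xi,\lambda), && \text{for } 0 \le \xi < x,\\
\mathfrak{G}(x,\xi,\lambda,\nu) &= -\mathfrak{Y}(x,\lambda)\mathfrak{D}^{-1}(\lambda,\nu)\mathfrak{B}^{(1)}(\lambda,\nu)\mathfrak{Y}(1,\lambda)\mathfrak{Y}^{-1}(\xi,\lambda), && \text{for } x < \xi \le 1.
\end{aligned}\tag{22.1}$$
At any set of arguments $(x, \xi)$ this matrix, as a function of $\lambda$, is analytic except
at the characteristic values, where singularities are introduced through at
least some of the elements of the matrix $\mathfrak{D}^{-1}(\lambda,\nu)$. These singularities are
poles, as may be seen from the formula
$$\mathfrak{D}^{-1}(\lambda,\nu) = \begin{pmatrix} d_{22}(\lambda,\nu)/D(\lambda,\nu) & -d_{12}(\lambda,\nu)/D(\lambda,\nu)\\ -d_{21}(\lambda,\nu)/D(\lambda,\nu) & d_{11}(\lambda,\nu)/D(\lambda,\nu) \end{pmatrix}, \tag{22.2}$$
and moreover poles of the first order whenever, as is here the case, each
characteristic value is of a multiplicity equal to its index. The residues are
therefore non-vanishing, and may, in the case of the Green's matrix, be computed
from either one of the formulas (22.1), since the difference of the right-hand
members of these formulas is the matrix $\mathfrak{Y}(x,\lambda)\mathfrak{Y}^{-1}(\xi,\lambda)$, and is thus analytic.
With the notational choice of the prefix "res$_p$" to indicate, for any matrix to
which it is applied, the residue at the characteristic value $\lambda_p$, it follows,
therefore, from the formulas (22.1) that
$$\operatorname{res}_p \mathfrak{G}(x,\xi,\lambda,\nu) = \mathfrak{Y}(x,\lambda_p)\{\operatorname{res}_p \mathfrak{D}^{-1}(\lambda,\nu)\}\mathfrak{B}^{(0)}(\lambda_p,\nu)\mathfrak{Y}(0,\lambda_p)\mathfrak{Y}^{-1}(\xi,\lambda_p). \tag{22.3}$$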


With any choice of the characteristic value \,, and with \ distinct from it 
but in a suitably small neighborhood of it, the identity 


DA, 


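is an obvious one. Its limiting form as $\lambda \to \lambda_p$ is contingent upon the
multiplicity of this characteristic value. If $\lambda_p$ is simple, the formula (22.2) shows
the limit to be
$$\operatorname{res}_p \mathfrak{D}^{-1}(\lambda,\nu) = \frac{1}{D_\lambda(\lambda_p,\nu)} \begin{pmatrix} d_{22}(\lambda_p,\nu) & -d_{12}(\lambda_p,\nu)\\ -d_{21}(\lambda_p,\nu) & d_{11}(\lambda_p,\nu) \end{pmatrix},$$
whereas it may be seen directly when the characteristic value is multiple, say
when $\lambda_p = \lambda_{p+1}$, that the limit is
$$\operatorname{res}_p \mathfrak{D}^{-1}(\lambda,\nu) = \{\mathfrak{D}_\lambda(\lambda_p,\nu)\}^{-1}.$$
With these two alternatives there are associated respectively the formulas
(19.11) and (19.13), and from them it may be verified that
$$\operatorname{res}_p \mathfrak{D}^{-1}(\lambda,\nu) = \sum_h \{\mathfrak{c}^{(h)}(\nu)\mathfrak{b}^{(h)}(\nu)\}\ (), \tag{22.4}$$
in which the sum indicated upon the right consists of the single term for
which $h = p$, or of the pair of terms for which $h = p, p+1$, according as $\lambda_p$ is
simple, or $\lambda_p = \lambda_{p+1}$. The substitution of the result (22.4) into the formula
(22.3) leads, in virtue of the relations (19.8) and (19.9), to the conclusion
that
$$\operatorname{res}_p \mathfrak{G}(x,\xi,\lambda,\nu) = \sum_h \{\mathfrak{y}^{(h)}(x,\nu)\,\mathfrak{z}^{(h)}(\xi,\nu)\}. \tag{22.5}$$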
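The residue evaluation at a simple characteristic value can be checked numerically on a toy case. The sketch below is illustrative only: the matrix `D` is a hypothetical 2x2 example with polynomial elements, not one taken from the paper, and the function names are assumptions. Its determinant $(\lambda-1)(\lambda-2)$ has the simple zero $\lambda = 1$, at which the column vector $c = (1, 0)$ and the row vector $b = (1, 1)$ satisfy relations of the form (19.7). The residue of the inverse, approximated by a contour integral over a small circle, is then compared with the product of $c$ and $b$ taken as matrices.

```python
import cmath


def D(lam):
    # Hypothetical 2x2 matrix with polynomial elements (not taken from the paper).
    return [[lam - 1, 1], [0, lam - 2]]


def inverse(m):
    # Inverse of a 2x2 matrix by the adjugate formula, as in (22.2).
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def residue_of_inverse(center, radius=0.25, samples=64):
    # Trapezoidal approximation of (1 / (2*pi*i)) * integral of D^{-1} d(lambda)
    # over the circle |lambda - center| = radius.
    total = [[0j, 0j], [0j, 0j]]
    for k in range(samples):
        w = cmath.exp(2j * cmath.pi * k / samples)
        inv = inverse(D(center + radius * w))
        for i in range(2):
            for j in range(2):
                total[i][j] += inv[i][j] * radius * w / samples
    return total


def outer(c, b):
    # Column vector c times row vector b, giving a square matrix.
    return [[ci * bj for bj in b] for ci in c]


res = residue_of_inverse(1.0)
expected = outer([1, 0], [1, 1])
print(all(abs(res[i][j] - expected[i][j]) < 1e-9 for i in range(2) for j in range(2)))
```

The same routine, applied to a circle enclosing several zeros of the determinant, returns the sum of the corresponding residues, which is the mechanism by which a finite set of terms of an expansion is summed by a contour integral.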

Consider now, in the case of any expansion (21.5), the term, or pair of 

terms, associated with any characteristic value. Since a,(v) is a scalar, and 


() The vectors are to be regarded as matrices for the purposes of the multiplications
indicated. Thus $\{\mathfrak{c}^{(h)}(\nu)\mathfrak{b}^{(h)}(\nu)\}$ is a square matrix.




because of the evaluation (19.8), the formula (21.3) leads directly to the 
equation 


1 


— (x, Ap)e(v)b(v) {WOM Ay, + BAP }, 


and this, together with the results (22.4) and (22.5), yields the relation 


an(v)y™ (x, v) TeSp G(x, 2, PYR(EFCE, 


+ res, )D-(A, [Bo v) f(y) + Bay, 
l=0 


(22 .6) 


The terms of any expansion (21.5) are thus expressible as residues in the
complex plane. It follows from this, of course, that any finite set of such
terms may be summed by a suitably designed contour integral as to $\lambda$, the
contour of integration being chosen to avoid the characteristic values, and to
enclose precisely those which are associated with the terms of the set in
question. In §17 a certain infinite sequence of contours $C_n$ was defined, any
one of the sequence, $C_n$, enclosing precisely those characteristic values $\lambda_p$ for
which $|\lambda_p| < \Lambda_n$. If for the values of $n$ there concerned, the initial partial
sums of the expansion (21.5) are denoted by $\mathfrak{s}(x,\nu,n)$, in the manner
$$\mathfrak{s}(x,\nu,n) = \sum_{p=1}^{n} a_p(\nu)\,\mathfrak{y}^{(p)}(x,\nu), \tag{22.7}$$


it follows that these sums are evaluated respectively by the formulas 


0 Cy 
(22.8) 


+ — Dx, v) v) FOP) + BLA, Jar. 


Ca 


23. On matters of convergence, divergence, and summability. When the
parameter $\nu$ is on a regular path, and $\nu \neq 0$, the formulas (21.3) associate with
any suitable vector $\mathfrak{f}(x,\nu)$ an expansion (21.5) in solutions of a boundary
problem of the regular type. Such expansions, both in the vector form here in
question, and in the alternative scalar form(*), are familiar, and it is known


(8) For a discussion of the relations between the scalar and vector formulations, cf. the 
author's paper, The expansion problem in the theory of ordinary linear differential systems of the 
second order, Trans, Amer. Math. Soc. vol. 31 (1929) p. 887. 




that their properties are broadly exemplified by those of the classical Fourier's
series. In particular, if $x$ is an interior point of the basic interval, and if in
some neighborhood of it the components of the generating vector $\mathfrak{f}(x,\nu)$ are
of bounded variation, the expansion converges at this point to the average
value, namely
$$\lim_{n\to\infty}\mathfrak{s}(x,\nu,n) = (1/2)\{\mathfrak{f}(x+,\nu) + \mathfrak{f}(x-,\nu)\}, \qquad \nu \neq 0\ (*). \tag{23.1}$$


These statements, on the other hand, do not ordinarily apply when $\nu = 0$.
The expansions are then relative to the given boundary problem, which is
highly irregular, and little theory of such expansions is known. Indeed, as to
boundary problems of the second order (the only ones here immediately
pertinent), all highly irregular cases that have been analyzed at all are
subsumable in the scalar form
$$\begin{aligned}
y''(x) - (2\lambda\cos p\pi/q)\,y'(x) + \lambda^2 y(x) &= 0,\\
(1-a)\,y(0) + a\,y'(0) &= 0,\\
b_1 y(0) + b_2 y'(0) + b_3 y(1) + b_4 y'(1) &= 0,
\end{aligned}\tag{23.2}$$
with constant coefficients, and in particular with $a$ equal to either 0 or 1,
and with $p$ and $q$ relatively prime integers(“). Moreover, definitive results
(uniform convergence), even for the expansions based upon these restricted
systems, have been obtained only for highly specialized generating functions,
specifically only when these functions are of the structure


f(x) = 


with $\phi(z)$ some analytic function of the complex variable $z$ which is bounded
in the circle $|z| < 1$ (*). The disparity between theorems such as these, and
those which comprise the theory of expansions relative to regular boundary
problems needs no emphasis.

That the expansions associated with highly irregular boundary problems
are in general divergent, even when the generating functions are analytic, is
observable from the simplest of explicit instances. Thus the expansion
generated by the function $f(x) = 1$ relative to the boundary problem (23.2) with
$a = 0$, $b_4 = 1$, $b_j = 0$, $j = 1, 2, 3$, is found to be


sin (nx — c)x sin (mx + 


ne — nx +c 


n=l 


(*) A proof of this is also implicit in the deductions of Chapter 8 below. 
(5) J. I. Vass, loc. cit. 
(*) For highly irregular boundary problems of order higher than the second, the known 
expansion theorems are similar and of comparable generality. They refer exclusively to boundary 
problems in which the differential equation is of the form y)(x)+ {A*+-0(x) } y(x) =0, 23, 
or some specialization of this form. Cf. J. W. Hopkins and L. E. Ward, loc. cit. 


with $c = p\pi/q$. On the interval $0 < x < 1$ the terms of this series fail to approach
zero as n+. The generating function f(x) =x**' leads to a similar result 
whenever & is not an integral multiple of g. 
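The origin of the arguments $(n\pi - c)x$ in such a series can be illustrated numerically. The sketch below is hedged: it assumes that the differential equation of (23.2) reads $y'' - 2\lambda\cos(c)\,y' + \lambda^2 y = 0$ with $c = p\pi/q$, and it takes the boundary relations $y(0) = 0$, $y'(1) = 0$ that correspond to $a = 0$, $b_4 = 1$, $b_1 = b_2 = b_3 = 0$. Under these assumptions $y = e^{\lambda x\cos c}\sin(\lambda x\sin c)$ vanishes at $x = 0$, and $y'(1) = 0$ holds when $\lambda\sin c = n\pi - c$. The function names and the choice $p = 1$, $q = 3$ are hypothetical.

```python
import math


def characteristic_value(n, c):
    # Assumed root of y'(1) = 0: lambda * sin(c) = n*pi - c.
    return (n * math.pi - c) / math.sin(c)


def y(x, lam, c):
    # Candidate solution, which vanishes at x = 0.
    return math.exp(lam * x * math.cos(c)) * math.sin(lam * x * math.sin(c))


def dy(x, lam, c):
    # First derivative of y, simplified with the sine addition formula.
    return lam * math.exp(lam * x * math.cos(c)) * math.sin(lam * x * math.sin(c) + c)


def d2y(x, lam, c):
    # Second derivative of y, obtained in the same way.
    return lam ** 2 * math.exp(lam * x * math.cos(c)) * math.sin(lam * x * math.sin(c) + 2 * c)


def ode_residual(x, lam, c):
    # Left-hand side of the assumed equation y'' - 2*lam*cos(c)*y' + lam^2*y.
    return d2y(x, lam, c) - 2 * lam * math.cos(c) * dy(x, lam, c) + lam ** 2 * y(x, lam, c)


c = math.pi / 3  # hypothetical choice p = 1, q = 3
for n in (1, 2, 3):
    lam = characteristic_value(n, c)
    print(n, round(lam, 6), abs(dy(1.0, lam, c)) < 1e-9, abs(ode_residual(0.4, lam, c)) < 1e-9)
```

With $\lambda\sin c = n\pi - c$ the oscillatory factor of the solution is $\sin (n\pi - c)x$, which is the factor visible in the terms of the series cited above.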

It will be clear, even from these fragmentary citations, that potentialities 
of fruitfulness in application to expansions of the type (21.5), with generating 
vectors that are in any broad sense arbitrary, are to be sought only in 
schemes of evaluation which refer to, and are based upon, some notion more 
general than that of ordinary convergence. In the following, two such modes 
of summation, each one of a familiar pattern, are to be defined. 

To begin with, since the boundary problem originally given is identified
with the parameter value $\nu = 0$, the expansion generated relative to it by a
vector $\mathfrak{f}(x)$, is obtainable from the formula (21.5) by the identification of
$\mathfrak{f}(x)$ with $\mathfrak{f}(x, 0)$, and is, thus
$$\sum_{p=1}^{\infty} a_p(0)\,\mathfrak{y}^{(p)}(x, 0). \tag{23.3}$$
Now under the hypotheses to which the boundary problems have already
been subjected, the characteristic values, and hence also the characteristic
solutions, are continuous as functions of $\nu$ along any regular path, inclusive
of the terminal point $\nu = 0$. If the vectors $\mathfrak{f}(x,\nu)$ and $\mathfrak{f}^{(h,l)}(\nu)$ are, therefore,
likewise taken to be continuous in $\nu$, and such that
$$\lim_{\nu\to 0}\mathfrak{f}(x,\nu) = \mathfrak{f}(x), \tag{23.4}$$
it is evident from the formulas (21.3) that each individual term of the
expansion (21.5) is continuous and approaches the respective term of the
expansion (23.3) as a limit when $\nu \to 0$. This latter series may, therefore, be
regarded as formally given by the expression
$$\lim_{n\to\infty}\,\lim_{\nu\to 0}\,\mathfrak{s}(x,\nu,n).$$
The expansion (23.3) is now to be defined as summable by the "means A" to
the value
$$\lim_{\nu\to 0}\,\lim_{n\to\infty}\,\mathfrak{s}(x,\nu,n), \tag{23.5}$$
if and when with some determination of $\mathfrak{f}(x,\nu)$ and $\mathfrak{f}^{(h,l)}(\nu)$ as vectors continuous
in $\nu$ and fulfilling the relation (23.4), the limit (23.5) exists.

The notion of summation basically involved in these means, will be
recognized as that which similarly underlies the classical means identified with the
names of Abel and Borel. For these latter may be looked upon as evaluating a
series
$$\sum_{p=1}^{\infty} u_p(x), \tag{23.6}$$
respectively, by the limits
$$\lim_{\nu\to 0}\,\lim_{n\to\infty}\,\sum_{p=1}^{n} (1-\nu)^p u_p(x),$$


n 
By the use of the relations (23.1) and (23.4) in conjunction with the
expression (23.5), it will be evident that the deductions of the preceding
sections have effectively established the following facts.


THEOREM. The expansion (23.3), generated by an integrable vector $\mathfrak{f}(x)$, is
summable by the means A to the value
$$(1/2)\{\mathfrak{f}(x+) + \mathfrak{f}(x-)\}, \tag{23.7}$$
at any point $x$ of the interval $0 < x < 1$, whenever $\mathfrak{f}(x)$ is such as to admit at that
point of representation in the manner (23.4), by a vector $\mathfrak{f}(x,\nu)$ which for every
$\nu$ on a regular path other than $\nu = 0$, fulfills the conditions

(i) that it is continuous in $\nu$;

(ii) that its expansions relative to regular boundary problems converge to the
value
$$(1/2)\{\mathfrak{f}(x+,\nu) + \mathfrak{f}(x-,\nu)\}.$$


Since the role of $\mathfrak{f}(x,\nu)$ in this theorem may in particular be taken by the
vector $\mathfrak{f}(x)$ itself, provided it fulfills the condition (ii), the following
specialization of the theorem is evident.


COROLLARY. The expansion of an integrable vector $\mathfrak{f}(x)$ relative to the highly
irregular boundary problem is summable by the means A to the value (23.7),
whenever $\mathfrak{f}(x)$ is such that its expansions relative to regular boundary problems
converge to that value.


A second scheme of summation alternative to that described above may
be defined in the following manner.

The expansion (23.3) shall be said to be summable by the "means B" to the
value
$$\lim_{n\to\infty}\mathfrak{s}(x,\nu_n,n), \tag{23.8}$$
if

(i) with the role of $\mathfrak{f}(x,\nu)$ taken by the vector $\mathfrak{f}(x)$ itself, and the vectors $\mathfrak{f}^{(h,l)}$
independent of $\nu$;

(ii) with the points $\nu_1, \nu_2, \nu_3, \cdots$ on some regular path of $\nu$; and

(iii) with
$$\lim_{p\to\infty}\nu_p = 0,$$
the limit (23.8) exists.

The means for summation of an expansion as thus defined may be seen
without difficulty to bear in principle a resemblance to the classical means of
Cesaro and Riesz. For these latter may be formulated respectively as
assigning to a series (23.6) the evaluations
$$\lim_{n\to\infty}\sum_{p=1}^{n}\{1-(p-1)\nu_n\}\,u_p(x), \quad \text{with } \nu_n = 1/n,$$
$$\lim_{n\to\infty}\sum_{p=1}^{n}\{1-w(p)\nu_n\}\,u_p(x), \quad \text{with } \nu_n = 1/w(n),$$
the function $w(n)$ being positive, increasing and unbounded.
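The contrast between the two patterns of summation can be seen on a scalar series. The sketch below is an illustration only, with hypothetical function names, and uses the divergent series $1 - 1 + 1 - \cdots$ as a stand-in for an expansion. The first routine applies factors $(1-\nu)^p$ of the Abel pattern, where $n$ is taken large before $\nu$ is taken small, as with the means A. The second applies factors $1-(p-1)\nu_n$ of the Cesaro pattern, where the parameter $\nu_n = 1/n$ is tied to the number of terms, as with the means B.

```python
def grandi(p):
    # Terms of the divergent series 1 - 1 + 1 - 1 + ... (illustrative choice).
    return 1.0 if p % 2 == 1 else -1.0


def abel_type_sum(term, nu, n):
    # Partial sum with the convergence factors (1 - nu)**p, for p = 1, ..., n.
    return sum((1.0 - nu) ** p * term(p) for p in range(1, n + 1))


def cesaro_type_sum(term, n):
    # Partial sum with the factors 1 - (p - 1)*nu_n, where nu_n = 1/n.
    nu_n = 1.0 / n
    return sum((1.0 - (p - 1) * nu_n) * term(p) for p in range(1, n + 1))


# Inner limit n -> infinity approximated by a large n, with nu small but fixed.
print(abel_type_sum(grandi, nu=0.001, n=20000))
# The parameter nu_n = 1/n is tied to the number of terms retained.
print(cesaro_type_sum(grandi, n=1000))
```

Both evaluations approach the value 1/2 for this series, although its partial sums alternate between 1 and 0 and so have no limit.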

Of the two schemes thus described, the means B may be characterized as 
providing a subtler mode of summation than the means A, in much the same 
sense as the means of Cesaro may be regarded as less drastic than those of 
Borel. It is only consistent with this, that no inference of summability of a 
highly irregular expansion by the means B is readable from the deductions 
already made. It is upon this point that the continuing discussion is focused. 


CHAPTER 8 
THE SUMMABILITY OF THE EXPANSIONS BY THE MEANS B 


24. The formula for the partial sums. For any index $n$ which identifies a
contour of the sequence $C_n$ defined in §17, the terms of an expansion that
correspond to the first $n$ characteristic values, are summed by the formula
(22.8). The role of this formula in any analysis of the expansion is, of course,
a central one; its convergence as $n \to \infty$, with either $\nu$ fixed or $\nu$ suitably
dependent upon $n$, being tantamount respectively to the convergence or the
summability of the expansion. The convergence of these sums, with
appropriately disposed parameter values, must, therefore, in due course be considered.
Preparatory to this, however, it is to be shown in this section that the
formula (22.8) may be expressed in such a manner as to display, among other
things, the fact that its elements are bounded as to $n$ when $\nu$ is bounded from
zero. This is obscured in the formula as it stands, due to the fact that certain
of its matrix factors have elements that are polynomials in $\lambda$, while others
involve exponentials each one of which is clearly unbounded for some range
of arg $\lambda$ as $n \to \infty$.

The Green’s matrix has been defined by the formula (22.1). For the pur- 
pose of giving alternative expressions for it, let the matrices 3, be defined 
for all subscripts r by the formulas 



and 




(24.1) 


3, if (mod 4), 


and let the relations 
go when 2, 
= 


(24.2) 
— when 31, 


specify the matrices (x, £). It is then a matter of simple verification that, 
irrespective of the choice of 7, the formulas (22.1) are replaceable by the 


relation 
(24.3) G = + Y(x) — 42} TOC. 


In virtue of the formula (5.6), the relation (24.3) is independent of the 
choice of 9)(x, A) as a nonsingular solution of the equation (4.1). This solution 
may, therefore, be chosen at any value of \ to be one to which the formulas 
(4.8), (4.9) apply. It will be supposed throughout that the solution 9)(x, d) is 
always so chosen. The elements a;i(A, v), given by the relations (5.3) and (5.9), 
are then specific, and the matrix D(A, v) is subject to the formula (5.10). 

Let the matrix Do(A, v) be specified by the definition 


(24.4) 


( + — doe — 
Do = 


— Go, — + 


and let Do designate its determinant. A comparison with the formula (5.10) 
yields then the evaluations 
1 
(24.5) 
D = 
Now it was observed in §20, that the elements of the matrices @“-(A, pv), 


which occur in the formula (22.8), are polynomials in \ of maximum degree 
7;—1—1. In accordance with this, the matrices W(X, v), as defined by the 


relations 
(24 .6) = h= 0, i; l= 0, 1, 2, (r1 1), 


have elements that are polynomials in 1/A, and it is clear that in terms of 
them 


(") Throughout the remaining discussion the explicit indications of functional arguments 
will be curtailed in the interest of simplicity in the formulas. Those variables that do not require 
current attention will therefore frequently be omitted. 


00 acs 
10 00 




(24.7) 


Since by the relations (4.8), (5.3), (5.9), and (24.5) the evaluations 
1 
D7VOPY(O) = — Dolais), 
Do 
1 
DI VOY(1) = D. Do( ai, i+2)E(1), 


also maintain, it will be recognized that the formula (24.3) is alternatively 
expressible in the form 


1 
(24.8) = + V(x) Dof{ (ais) Be — (04,542) E(1) S42} 
0 


The elements of the matrices P(x, λ) and P^{-1}(ξ, λ), which enter into this 
through the solution Y and its inverse, and hence also the elements of the 
matrices (5.9), are, as has been observed in §5, asymptotically representable 
by formal power series in negative powers of λ. Through the relations (24.7) 
and (24.8), the expression of the integrands in the formula (22.8) without the 
utilization of any positive powers of λ has thus been attained. 

For the further analysis of these formulas, let the matrices Y_r(x, λ) be 
defined, for all indices r, by the formula 


(24.9) d) = P(x, + 
The relations 

= {E(1)9, + Sera}, 
= (1) + 


follow at once, and as a consequence the formulas (24.8), (24.7) may be re- 
written into the forms 


G = + 


(24.10) 


(24.11) 


in which 
a, = {(a:)3- 542) 
(24.12) 


1 
{€(1)3; + S42} Do. 
Do 




346 R. E. LANGER’ [March 


On the contours of the sequence C_n, let the arcs C_{nr} be defined by the 
relations 


(24.13) — — argT,-1 S arg S — — argI,. 


Any two contiguous arcs of this set comprise a semicircle, and those associated 
with any four successive values of r constitute a complete contour. Upon 
associating with each arc C_{nr} the respective evaluations (24.11), the formula 
(22.8) may now finally be expressed in the form 


(24.14)    s(x, \nu, n) = s_0(x, n) + s_1(x, \nu, n) + s_2(x, \nu, n), 
with 
ol x, ddd: 
2) = J. [D(2) JORCO 


(24.15) », m) = 


r=1 Car 


$2(x, v, n) = > AH 4 
Coy l=0 

Of the matrices which enter into these formulas those designated by Y, 

are shown by the relations (24.12) to involve no exponentials, and to be 

bounded for all large values of |λ|. The matrices U_r, on the other hand, are 


less simply constructed. By the formula (18.1) and the second one of the 
relations (24.5), the equality 


1 (— 1)*'B,(, 


(Ve—V 1) 


is established, and in virtue of it the matrices in question are found to be 
explicitly as they are given by the table: 


r 1 3 4 


(24.16) By —B: By 0 —B, 0 
») ear) a ca(v) \O calv) ( 1 


Consider the exponentials which occur in the elements of any one of the 
matrices 


(24.17) Diz), Deal, Ue 
They are in every instance of a form which, in terms of the abbreviations 
(24.18)    \Gamma_j(x', x'') = \Gamma_j(x') - \Gamma_j(x''), 


may be written as 




(24.19) s=rr+1, 


with the arguments α, β each taking some one of the values 0, ξ, x, 1(18). 
Moreover, when they are so written, the subscripts s being related to the 
index r that identifies the arc of integration as is indicated in (24.19), then in 
every instance the relation β ≥ α is fulfilled. It will be shown that because of 
this every one of the exponentials in question is bounded over the range of 
integration for which it is involved. 

From the formulas (2.5) and (2.7) it follows that when β ≥ α, then 


a) = | T,(6, a) | efare Ty, 
Under the substitutions given by the relations 


A= F arg 


the upper or lower signs maintaining according as s = r or s = r + 1, the arc 
C_{nr} corresponds to the range 


(24.20)    0 \le \theta \le \omega_{r-1}, 
in which the angle ω_{r−1} is that given by the definition (11.8). The evaluations 
(24.21) | | = sine, for B 2 a, 


then show that on this range the exponentials concerned are bounded as to n. 
It follows that with the possible exception of U_r, the matrices (24.17) are all 
bounded on the respective arcs of integration involved in the formulas (24.15), 
and that unboundedness can inhere in the matrix U_r only through the scalar 
factors indicated in the table (24.16). 

By the relations (18.2) the scalar functions B_r(λ, ν) are shown to be 
bounded, uniformly as to ν, provided the range of ν is restricted to the 
path segment (ν_0, ν_n) when λ is on the contour C_n. This condition is fulfilled 
in particular when ν is bounded from zero, and since in this case the boundedness 
of the coefficients 1/c_j(ν) is also assured, the uniform boundedness of the 
elements of the matrices U_r(λ, ν) follows. If ν is not bounded from zero, on the 
other hand, this conclusion may not be drawn, since it is the earmark of any 
highly irregular boundary problem that at least one coefficient c_j(ν) approaches 
zero with ν. However, from the formulas (12.2) and (8.4) it is seen 
that at all events the functions ν/c_j(ν) are bounded. It may accordingly be 
inferred that the elements of the matrices 


(24.22) »)}, 


are uniformly bounded for ν on the path segment (ν_0, ν_n) when λ is on the 
arc C_{nr}. 


(18) It may be noted that Γ_j(x′) = Γ_j(x′, 0), and that Γ_j(x′, x″) = Γ_{j+2}(x″, x′), since 
Γ_j(x′) = −Γ_{j+2}(x′). 




25. Lemmas. An analysis of the terms in the relation (24.14), as these are 
given by the respective formulas (24.15), may be based in large measure upon 
certain auxiliary deductions, of which some may be regarded as elementary, 
whereas others are appropriate adaptations of classical convergence theo- 
rems. The isolation and specific formulation of these deductions is a matter 
of evident convenience for their later applications. They will, therefore, be 
set forth in this section in the form of lemmas, in the interpretation of which 
it shall be understood that: 

(i) Any interval designated by (α, β) is such that 0 ≤ α < β ≤ 1; 

(ii) The symbol γ_{nr} is a designation of the semicircle composed of the 
arcs C_{n,r−1} and C_{nr}; 

(iii) The range of the index n is the sequence of integers for which the 
contours C_n have been defined; 

(iv) For any value of n the range of the parameter ν is the segment 
(ν_0, ν_n) of some chosen regular path; 

(v) The range of the variable ξ is in every instance an interval (ξ_1, ξ_2) for 
which 

(vi) Relative to any interval (ξ_1, ξ_2) and any sequence of arcs γ_{nr}, the 
symbol M(ξ, λ, ν) is a generic designation for matrices whose elements 
m_{ij}(ξ, λ, ν) are uniformly bounded, namely which fulfill some set of relations 


(25.1)    |m_{ij}(\xi, \lambda, \nu)| \le p_{ij}, 


in which the p_{ij} are positive constants. 
(vii) The symbol $,(&’, £’’, &, g) is defined by the formula 


(25.2) & = f d, 


Yar 


LEMMA 1. The elements of the matrix 


(25.3)    \int_{\gamma_{nr}} M(\xi, \lambda, \nu) \, \frac{d\lambda}{\lambda^{1+q}} 


are bounded uniformly as to ξ, ν and n if q ≥ 0, and if q > 0 they approach zero 
uniformly as to ξ and ν, as n → ∞. 
The asserted facts are obvious in virtue of the relations (25.1). 


LEMMA 2. If β > α, the elements of the matrix 
(25 3-(B, g, q) 


are bounded uniformly as to ξ, ν and n if q ≥ 0, and if q > 0 they approach zero 
uniformly as to ξ and ν, as n → ∞. 


Since any arc C_{nr} is identified with the respective range (24.20), the 
relations 




(25.5) ind j= 0,1, 
@r—1—j 


maintain on the respective portions C_{n,r−j} of the arc of integration γ_{nr}. By 
virtue of the evaluations (24.21), it is therefore seen that the elements of the 
matrix (25.4) are dominated by those of the sum 


sin Wr—1— i 


exp {- An|T+(8, a) | d8(p;;). 


j=0 0 r—1—j 


An explicit integration shows these dominant elements to be at all events 
bounded as to n, and to approach zero if q is positive. 
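The kind of explicit integration invoked in this proof can be checked numerically on a scalar model. The sketch below is not taken from the paper: it assumes an integrand of the form exp(−Λ sin θ) on 0 ≤ θ ≤ ω with 0 < ω < π, where Λ stands in for a product such as λ_n|Γ(β, α)|, and compares the integral with the elementary bound that follows from sin θ ≥ (sin ω/ω)θ. The function names and the choice ω = π/2 are illustrative assumptions only.

```python
import math

def arc_integral(big_lambda, omega, steps=20000):
    """Midpoint-rule value of the integral of exp(-big_lambda*sin(theta)) over [0, omega]."""
    h = omega / steps
    return h * sum(math.exp(-big_lambda * math.sin((k + 0.5) * h))
                   for k in range(steps))

def chord_bound(big_lambda, omega):
    """Upper bound obtained from sin(theta) >= (sin(omega)/omega)*theta, valid for 0 < omega < pi."""
    slope = math.sin(omega) / omega
    return (1.0 - math.exp(-big_lambda * slope * omega)) / (big_lambda * slope)

if __name__ == "__main__":
    omega = math.pi / 2  # assumed arc angle, chosen only for illustration
    for big_lambda in (1.0, 10.0, 100.0):
        print(big_lambda, arc_integral(big_lambda, omega), chord_bound(big_lambda, omega))
```

For a fixed angle ω the bound is at most ω/(Λ sin ω), so in this model the integral decays like 1/Λ as Λ grows.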


LEMMA 3. The elements of the matrices 
(25 .6) 3-(8, & 9), with & <8, 
and 
(25.7) SE, 9g), with a<h, 


are bounded uniformly as to ξ, ν and n if q ≥ 0, and if q > 0 they approach zero 
uniformly as to ξ and ν, as n → ∞. 


Since in the identity 


the final exponential is bounded on the arcs γ_{nr}, as is shown by the evaluations 
(24.21), the matrix (25.6) is in fact of the form (25.4) with α = ξ. Similarly 
the matrix (25.7) is of the form (25.4) with β = ξ. The assertions therefore 
follow from the Lemma 2. 


LEMMA 4. For α ≤ ξ_1 < ξ_2 < β the matrices 
& 
(25.8) f a, & 1)d8, 
& 


and 


& 
(25.9) & & 


approach 0 as n → ∞, uniformly as to ξ_1, ξ_2 and ν. 


By virtue of the relations (24.21) and (25.1), the elements of the matrix 
(25.8) are respectively dominated by those of the sum 


1 
— in 0 d0dé. 
exp Aa 0) | sin 0} 




These latter are, however, in turn uniformly dominated by the elements of 
the matrix sum, which, with any suitably small positive ε, is given by the 
formula 


j=0 
@r-1-i sin 
+ f f exp {- An| Tea + | 
a+e 0 


@Wr—1—j 


In these matrices the elements of the first two may be made arbitrarily small 
by the choice of ε, and those of the remaining ones are then arbitrarily small 
when n exceeds some specific value, as is shown by explicit integrations. The 
convergence of the matrix (25.8) thus follows, and a similar argument establishes 
the fact for the matrix (25.9). 

LEMMA 5. If R(ξ, ν, n) is any matrix such that: 

(i) its elements are uniformly bounded; 

(ii) for α ≤ ξ_1 < ξ_2 < β the matrix 


(25.10)    \int_{\xi_1}^{\xi_2} R(\xi, \nu, n) \, d\xi 


approaches 0 as n → ∞, uniformly as to ξ_1, ξ_2 and ν; the relation 


\lim_{n \to \infty} \int_{\alpha}^{\beta} R(\xi, \nu, n) f(\xi) \, d\xi = 0, 

maintains uniformly as to ν, for every vector f(ξ) whose components are integrable 
over the interval (α, β). 

This is an immediate consequence of a familiar general convergence 
theorem(19). 
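A scalar model may make the hypotheses and conclusion of Lemma 5 concrete. The sketch below is an illustration only and is not taken from the paper: it assumes the bounded kernel cos(nξ) in place of the matrix R(ξ, ν, n), for which the integral over any subinterval is at most 2/n in absolute value, and it checks numerically that the integral against an integrable function also tends to zero. The function names and the test function √ξ are assumptions.

```python
import math

def kernel_integral(n, xi1, xi2):
    """Exact integral of cos(n*xi) over [xi1, xi2]; a scalar stand-in for the matrix (25.10)."""
    return (math.sin(n * xi2) - math.sin(n * xi1)) / n

def weighted_integral(n, f, alpha=0.0, beta=1.0, steps=20000):
    """Midpoint-rule value of the integral of cos(n*xi)*f(xi) over [alpha, beta]."""
    h = (beta - alpha) / steps
    total = 0.0
    for k in range(steps):
        xi = alpha + (k + 0.5) * h
        total += math.cos(n * xi) * f(xi)
    return h * total

if __name__ == "__main__":
    for n in (4, 40, 400):
        print(n, kernel_integral(n, 0.2, 0.7), weighted_integral(n, math.sqrt))
```

Both printed columns shrink as n grows, which is the behaviour the lemma asserts in the matrix setting.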

LEMMA 6. If R(ξ, ν, n) is any matrix such that: 

(i) for α ≤ ξ_1 < ξ_2 < β, the elements of the matrix (25.10) are bounded 
uniformly as to ξ_1, ξ_2, ν, and n; 
(ii) for α < ξ_1 < ξ_2 < β, the matrix (25.10) approaches 0 as n → ∞, uniformly 
as to ν; 
(iii) fora<&<B 


lim RE, v, = L.(v); 


no a 


(iv) fora<ii<6 


(19) E. W. Hobson, The theory of functions of a real variable, Cambridge University Press, 
Vol. II, 1926, p. 422. 




8 
lim v, = Vplv); 
avo 


the relation 4 
lim RK(E, n)f(é)dé L.f(a+) + 


no 


maintains for every vector f(ξ) whose components are of bounded variation on the 
interval (α, β). 

This is an evident formulation of a familiar theorem in singular 
integrals(20). 

LEMMA 7. If R(ξ, ν, n) is any matrix which fulfills the specifications (i) 
and (ii) of Lemma 6, the relation 


\lim_{n \to \infty} \int_{\alpha}^{\beta} R(\xi, \nu, n) f(\xi) \, d\xi = 0, 


maintains uniformly as to ν, for every vector f(ξ) whose components are of 
bounded variation on the interval (α, β), and for which f(α+) = 0, f(β−) = 0. 


The argument by which the Lemma 6 is established, serves also to prove 
the assertion here, the conditions (iii) and (iv) of the Lemma 6 being 
dispensable because of the vanishing limits of the vectors f(ξ) concerned, at 
α and β. 

On the basis of these lemmas an analysis of the expressions (24.15) is to 
be given in the remaining discussion. Consistent with the prime purport of 
this, which is ultimately to establish summability of the expansions by the 
means B, it will be assumed henceforth that all vectors f(ξ) and f^{(h,l)} which 
are brought into question are independent of ν, and that the vectors f(ξ), 
moreover, all have components that are integrable over the interval (0, 1). 
The point x at which an expansion is considered will always be regarded as 
fixed. Since the analysis which applies when x is an end point of the basic 
interval (0, 1) differs materially from that which is applicable when x is an 
interior point, these cases will be separately discussed, the latter in §§26 
and 27, and the former finally in §28. 

26. The convergence of the vector s_0(x, n), when 0 < x < 1. With the use 
of the evaluations (24.2) of the matrices $(x, £), the formula for the vector 
s_0(x, n), as it is given by the first one of the relations (24.15), is found, after 
the collection of similar integrals over abutting arcs of integration, to be ex- 
pressible in the form 


(26.1) &(2, = f (6, + = f ROG n)f(dé, 


(20) Cf. Hobson, loc. cit., pp. 446-448. 




the matrices R(£, m) being given explicitly by the formulas 


(26.2) KOE, =— 


Yar 


and 


(26.3) n= f Fo AA. 


Consider the matrix (x, £). Since by the formula (4.8) the matrix 
¥)(x) is factorable in the manner $(x)€(x), whereas the identities 


(26 .4) E(x) = r= 1,2, 


maintain, as may be easily verified, it is seen that the formula (26.2) is given 
somewhat more explicitly by the form 


2 
(26.5) ROE, mn) = — (EAA. 

Now by the formula (4.9) the matrix (x), and hence also its inverse, differs 
from the unit matrix by a term which is uniformly of the order of 1/A. Aside 
from its scalar exponential factor, the integrand shown in the relation (26.5) 
is, therefore, of the form 


SoM (E) + (1/A) ME, d). 
With the use of the relations 
(26 .6) = (Sor, 1, 2, 


which follow from the fact that the functions r;(&) which are elements of the 
matrix #R(£) can also be expressed respectively as r(é), the complete inte- 
grands in the formula (26.5) are, therefore, seen to be of the structure 


(E) Sor + é, é, 1). 


An integration with respect to £ accordingly yields the relation 


dx dx 
RK (E, n)dt = — f — 


1 nr Yar 


(26.7) 


Let x now be fixed upon as any point in the interior of the interval (0, 1) 
in some neighborhood of which the components of the generating vector f(ξ) 
are of bounded variation. With a suitable determination of ε as a positive 
constant, the neighborhood in question contains the interval (x − ε, x + ε). It 
will be supposed in the following, that an ε has been determined upon which 
fulfills this condition. Then, on the one hand, if the points ξ, ξ_1, and ξ_2 lie 
upon the interval (0, x − ε), the integrals in the formula (26.5) are of the 
form (25.6) with q = 0 and β = x, whereas each integral in the formula (26.7) 
is either of the form (25.6) with q = 1, or of the form (25.9). It follows from 
the Lemmas 3 and 4 that the matrix R^{(1)}(ξ, n) fulfills the hypotheses of the 
Lemma 5, relative to the interval in question, and hence that 


(26.8) lim f = 0. 


On the other hand, if the points ξ_1 and ξ_2 are taken to lie upon the interval 
(x − ε, x), it is found similarly by the use of the Lemmas 1, 2, and 4, that the 
matrix R^{(1)}(ξ, n) fulfills the hypotheses of the Lemma 6, with α = x − ε, β = x, 
and with L_α = 0, L_β = (1/2)I. It follows, therefore, that 


(26.9) lim RO (E, m)f(EdE = wif(x—). 


If the consideration is now turned to the matrix R^{(2)}(ξ, n) with x < ξ, the 
reasoning given may be essentially repeated. It is found, thus, on the basis of 
the relations 


E(x) EME) = 


26.10 


that 
(26.11) lim n)f(EdE = 0, 


z+e 


and that 
zte 
(26 .12) lim RY (E, n)f(E)dE = wif(x+). 


The convergence and limiting values of the terms of the formula (26.1) have 
thus been established, the results admitting of summary in the following form. 
The vector s_0(x, n) converges as n → ∞ to the value 


(26.13)    (1/2) \{ f(x-) + f(x+) \}, 


at every point x which is in the interior of the interval (0, 1) and in some neighborhood 
of which the components of the generating vector f(ξ) are of bounded variation. 
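Convergence to the mean of the one-sided limits is the behaviour familiar from classical singular integrals. As an illustration only, the sketch below uses the classical Dirichlet kernel sin(n(ξ − x))/(π(ξ − x)), which is an assumed stand-in and not the kernel of the formula (26.1), applied to a step function with a jump at x = 1/2; the function names and the test function are likewise assumptions.

```python
import math

def dirichlet_integral(n, f, x, alpha=0.0, beta=1.0, steps=100000):
    """Midpoint-rule value of (1/pi) * integral of sin(n*(xi-x))/(xi-x) * f(xi) over [alpha, beta]."""
    h = (beta - alpha) / steps
    total = 0.0
    for k in range(steps):
        xi = alpha + (k + 0.5) * h
        d = xi - x
        # the kernel has the removable value n at xi == x
        kernel = n if d == 0.0 else math.sin(n * d) / d
        total += kernel * f(xi)
    return h * total / math.pi

def step(xi):
    """Test function with a jump at 0.5: f(0.5-) = 1 and f(0.5+) = 3."""
    return 1.0 if xi < 0.5 else 3.0

if __name__ == "__main__":
    for n in (20, 200):
        print(n, dirichlet_integral(n, step, 0.5))
```

As n grows the printed values approach (1/2){f(x−) + f(x+)} = 2 for this test function.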

27. The summability of the expansions at interior points of the interval 
(0, 1). By the second one of the formulas (24.15) the product of the vector 
s_1(x, ν, n) by ν is expressible in the form 


1 


with the kernel matrix R(ξ, ν, n) given by the relation 


-1 
It was observed in §24, moreover, that each matrix which appears in any 
integrand of this formula (27.2), has elements that are uniformly bounded 
over the range of integration concerned, provided the parameter ν is restricted 
to the respective path segment (ν_0, ν_n). Under this restriction upon ν, 
which is to be imposed and maintained throughout this discussion, the integrands 
of the formula (27.2) are thus all of the type M(ξ, λ, ν), as that has 
been defined in §25. A somewhat more explicit determination of the structure 
of these integrands is requisite, and is obtainable as follows. 

By the formula (24.9), the matrices Y_r(x) are found to have, for the several 
indices r, the forms given by the table: 


Y,-(x, d) 
OTLB (x) + OP(x) 
AT OB(x) + OB(x) 


If these forms are substituted into the relation (27.2), and thereupon the 
integrals involving any specific exponential over contiguous arcs of integra- 
tion are collected, it is found that the result may be written in the manner 


4 
(2. M(E, r, v)dd + OT, A, v)dr. 
r=3 


Yne 


(27.4) R(é,», n) = >> 


Ynr 


Specifically the matrix indicated here by M(é, A, v) over any semicircle Yap, 
is identifiable as the product 
(27.5) [B(x) Sar { on the arc Cons j = 0,1. 


Consider any matrix 3(x, X, v) which fulfills a relation 


(27.6) BPE) = 


in which € is a matrix that is constant as to ¢. It may be deduced from the 
equation (4.1), then, that 3(€) is a solution of the adjoint differential equation 


(27.7) = — BOLPRO + 


and from this it follows that with any choice of & and & on the interval (0, 1) 


| 
1 
(27 .3) 2 
3 
+ 
| 




(27.8) f = —{3@ 3) + . 


This formula is applicable to the integration of the relation (27.4). Since the 
matrix products enclosed within the square brackets in the expressions (27.5) 
are each of the type prescribed by the designation 3, the integration of these 
expressions is given by the formula (27.8). Since, furthermore, each matrix 
thus designated by 3 is of the type denotable by Mt(é, A, v), the entire 
right-hand member of any resulting relation (27.8) is clearly of the form 
(1/A) ME, A, v), uniformly as to the choice of §&, on the interval (0, 1). The 
formula (27.2) leads, therefore, also to the relations 


ts 2 
f », m)dt = 2, ) 
1 Yar 


(27.9) 

+2 f (Es, dy — 
r=3 


The integrals in the right-hand members of the formulas (27.4) and (27.9) 
may now all be recognized as being of the forms (25.4) with q, respectively, 
equal to either 0 or 1. It follows from this by the Lemma 2 that the matrix 
(27.2) fulfills the hypotheses of the Lemma 5. By that lemma, then, the relation 
(27.1) leads to the conclusion that 
(27.10)    \lim_{n \to \infty} \{ \nu s_1(x, \nu, n) \} = 0, 

uniformly as to ν on the segment (ν_0, ν_n). 

The analysis which has thus been given for the second one of the for- 
mulas (24.15) may be applied equally well, and in a wholly similar manner 
to the third one of these formulas. It yields in this instance an evaluation of 
the form 


Ine 


(27.11) 


and inasmuch as each integrand on the right of this equality is of the type 
(25.4) with q > 0, it follows from the Lemma 2 that irrespective of the vectors 
f involved, 


(27.12)    \lim_{n \to \infty} \{ \nu s_2(x, \nu, n) \} = 0, 


uniformly as to ν on the segment (ν_0, ν_n). 
In virtue of the conclusions (27.10) and (27.12), it may evidently be inferred 
that there exists a positive monotonic sequence of constants 

ε_1, ε_2, ε_3, ... 


which converges to zero, and which is furthermore such that for each index 
n, and for all values of ν on the respective path segment (ν_0, ν_n), the relations 


(27.13)    | \nu s_1(x, \nu, n) + \nu s_2(x, \nu, n) | < \varepsilon_n, 
are fulfilled. On the path of ν let the sequence of points 
(27 .14) vi, Ve, 


be determined now so that 
(i) vf lies on the segment (vo, vn), 
(ii) 

(27.15) 

and 
(iii) 

(27 .16) 


The determination of such a sequence is clearly possible. Since the inequalities 
(27.13) are, then, in particular fulfilled when »v=»p,{, it follows from them, 
together with the relation (27.16), that 


(27.17) lim {8,(x, Va, 2) + , n)} = 9, 


This result, taken in conjunction with that of §26, permits of the conclusion 
which follows. 


THEOREM. The expansion generated by an arbitrary integrable vector f(ξ), and 
an arbitrary set of associated vectors f^{(h,l)}, h = 0, 1; l = 1, 2, 3, ..., (τ_1 − 1), is 
summable by the means B to the value (1/2){f(x−) + f(x+)} at every interior 
point of the basic interval in some neighborhood of which the components of the 
vector f(ξ) are of bounded variation. 


28. The summability of the expansions at the end points of the interval. 
The reasoning of the two immediately preceding sections depends explicitly 
upon the relation 0 < x < 1, and is essentially inadequate when the point at 
which the expansion is considered is an end point of the basic interval. At 
these points, x = 0 and x = 1, therefore, distinct considerations are requisite. 
Such are to be given in the following, it being assumed throughout their 
course, firstly, that the given boundary problem has, if necessary, been adjusted 
in the manner described in §3, so that the relations (3.2) maintain; 
and secondly, that only such expansions are brought into question as have 
generating vectors f(ξ) whose components are of bounded variation in some 


= 0, 
En 
=> 0. 
Vy, 
| 


right-hand neighborhood of ξ = 0 and in some left-hand neighborhood of ξ = 1. 
A positive constant ε may be determined in each case, then, so that the 
boundedness of variation assumed is maintained over each one of the intervals 
(0, ε) and (1 − ε, 1). It will be supposed that ε has been so determined. 
The pair of intervals (0, ε), (1 − ε, 1) will briefly be designated by the symbol 
Δ, and it shall be understood that the designations f(0) and f(1) signify the 
limiting values f(0+) and f(1−), respectively. 

As a function of ξ the Green's matrix G(x, ξ, λ, ν) is a solution of the 
differential equation (27.7), as may be seen at once from its definition (22.1). 
It is found in virtue of this that the relation 


1 
(28.1) -f = ff + 


maintains, and furthermore that every differentiable vector g(£) satisfies the 
identity 


(28.2) 


l—e 
+ GOs — 6} =. 


The specific vector for which this identity is to be utilized, and to which the 
designation g(ξ) will hereafter be restricted, is the following one: 


(28.3) = + (1 — 


If the respective members of the identities (28.1) and (28.2) are added, 
and their sum is integrated over the contour C_n, the result is an evaluation of 
the first integral in the formula (22.8). With a suitable grouping of the terms over 
the several ranges of integration, that formula is, therefore, found to be 
expressible in the form 


(28.4) 8(x, v, m) = f(x) + > v, %), 
k= 
with 


(28.5) 
f + } ar, 
nid c, 


and 


c, 


1 
dt 




and with the remaining terms expressible through the matrices 


R(x, i, = —f G(x, A, v) 
c, » 


1 
R(x, é, n) —f é, r, 
Ca 


1 
R® (x, é, n) = —f G(x, é, r, 
2x1 Ca 
by the formulas a 
(28.8) 
= | KRY {OFA — }dé, 
03 f RO {OF— 
and 
804 = f = 9} dé, 
A 


(28.9) 
805 = -f R® {fF — g} dé. 


Consider the formula (28.5). If the indicated integration as to ξ therein 
is performed, and the expression f(x) is replaced by its equivalent 


x 
the formula is seen to be preening 


{@(«, 1 —)f(1) — G(x, 0 +)f(0) — f(a) 


Now the definition (22.1) of the Green’s matrix leads to the evaluations 
G0, 0 +) = — YOD' 

G(0, 1 —) = — DBA), 

G(1,0+) = DBA), 

G1,1-)= VA), 

whereas it follows from the relations (20.1), with λ′ and λ″ identified, 
respectively, as λ and 0, that 

(28.12) = BM (A) — BO), 


(28.11) 




In virtue of these substitutions the formula (28.10) reduces, irrespective of 
whether x =0 or x=1, to the form 


and this leads finally, via the evaluations 
Y(x)D-A) = 1,2, 3,4, 


to the relation 


j=l 
The transformation of the formula (28.6) is more direct. Through the 
mere use of the second one of the equations (24.11), it yields, namely, the 
relation 


ri-1 4 


(28.14) vo = — > D(x) » 

Under the restriction of the parameter ν to the path segment (ν_0, ν_n), each 
matrix factor which appears in an integrand in the formulas (28.13) and 
(28.14) has been seen to be uniformly bounded over the range of integration 
for which it is involved. Due to the relations (3.2), and by the Lemma 1, 
therefore, it may be concluded that 


lim { n)} = 0, 


(28.15) lim {»€o(x, v, m)} = 0, 


uniformly as to ν on the segment (ν_0, ν_n). 

On the basis of the first one of the formulas (24.11), and by considerations 
which are now familiar, it is found that both when x =0 and when x =1 the 
definitions (28.7) assure for the matrices (x, &, v, m) and R® (x, v, 
the forms 


Tne 


j= 9,1, 


provided ν is restricted to range between ν_0 and ν_n. For any choice of ξ_1 and 
ξ_2, moreover, 


& 




(xe, E,)dE = — (x, Eo) + (x, 
(28.18) 
— f R(x, 
the latter one of these relations following from the fact that the matrix G 
satisfies the differential equation (27.7). 

The Lemmas 1 and 4 applied to the formula (28.16) with j = 1, show 
readily that the matrix vy‘ (x, £) fulfills the hypotheses of the Lemma 5 with 
2.(v) =D, and &s(v) =O, and with (a, 8) as any subinterval of the interval 
(0, 1). Since the vectors and —9’ (€)} have com- 
ponents that are of bounded variation respectively on the interval (¢, 1—«) 
and the pair of intervals A, it follows from the formulas (28.8) and the 
Lemma 5 that 


lim { n)} = 9, 


(28.19) lim { v8os(x, », = 0, 


uniformly as to ν on the segment (ν_0, ν_n). 

The formulas (28.17) and (28.16) with j = 1, show readily that the matrix 
vR (x, &) fulfills the hypotheses of the Lemma 7 relative to the intervals A. 
The formulas (28.18) and (28.16) with j = 0, show similarly that the matrix 
v&®(x, £) fulfills the hypotheses of the Lemma 5 relative to the interval 
(e, 1—e). In virtue of the formulas (28.9), and the fact that the vector 
{(é) —g(é) } vanishes at £=0 and £=1, it follows, therefore, lastly that 


lim { n)} = 0, 


lim {v8o5(x, », 2) } = 0, 


uniformly as to ν on the segment (ν_0, ν_n). 
The results (28.15), (28.19) and (28.20) evidently insure the existence of a 
sequence of positive constants ε_1, ε_2, ε_3, ... which converges to zero, and 
which is such that the relations 

5 
v >. Sox(x, », < n=1,2,3,---, 
k=O 


maintain, irrespective of how in the nth one of them the value of ν is chosen 
on the path segment (ν_0, ν_n). In particular, then, these points may be chosen 
as the respective members of a sequence (27.14) which fulfills the relations 
(27.15) and (27.16). For such a choice it is clear that 




lim > Vn = 0, 


and this result yields through the relation (28.4) the following and final con- 
clusion. 


THEOREM. The expansion generated by an arbitrary integrable vector f(ξ) 
whose components are of bounded variation in some right-hand neighborhood of 
the point ξ = 0, and in some left-hand neighborhood of the point ξ = 1, is 
summable by the means B to the vector f(x) at the points x = 0 and x = 1. 


UNIVERSITY OF WISCONSIN, 
MADISON, WIS. 




ON APPROXIMATING CERTAIN INTEGRALS BY SUMS 


BY 
C. RAYMOND ADAMS AND ANTHONY P. MORSE 


1. Introduction; the main problem and its setting. In a recent paper(1) 
we showed that if E is a measurable linear set (bounded or unbounded), if 
f ∈ L(E), if k is a fixed real number ≥ 1, and if A_n and B_n (n = 1, 2, 3, ...) 
are sequences (finite or infinite) of measurable subsets of E satisfying the 
conditions 


A_n \subset B_n, \quad 0 < |B_n| \le k|A_n|, \quad \operatorname{diameter}(B_n) < \delta \quad (n = 1, 2, 3, \dots); 
|B_m B_n| = 0 \ \text{for } m \ne n; \quad \sum_n B_n = E, 


then the following relation holds: 


The present paper is concerned chiefly with the generalization of this result 
obtained by replacing f(x) by φ[f(x)]. From the statement above it 
is evident that the order of the elements of the sequences A_n and B_n 
(n = 1, 2, 3, ...) is immaterial. It is therefore desirable to introduce a notation 
which will be free from any implication of order. To achieve this purpose 
and to enable us to state our present problem with precision we formulate 
several definitions as follows. 


(1.1) DEFINITION. For B a non-vacuous linear set the diameter of B is 
sup B − inf B; this is designated by the symbol diam B. For B vacuous diam B 
is defined as zero. 


It will presently be clear that diam B can be interpreted as the essential 
diameter of B without disturbing any of the subsequent results, where by 
essential diameter of B is meant ess sup B—ess inf B when |B| is > 0 and 
zero when | B| =0. 


Presented to the Society September 8, 1942; received by the editors August 22, 1942. 
(1) Adams and Morse, Random sampling in the evaluation of a Lebesgue integral, Bull. Amer. 
Math. Soc. vol. 45 (1939) pp. 442-447, Theorem 4. Hereinafter this paper will be referred to 
as RS. 
A set of real numbers will be spoken of as a linear set. For E a linear set measurable in the 
sense of Lebesgue and of positive measure, L(E) represents the class of functions f the domain 
of each of which is E and for each of which the Lebesgue integral ∫_E f(x) dx exists (finite). For 
f ∈ L(E) we usually abbreviate ∫_E f(x) dx to ∫_E f. 
When and only when a linear set E is measurable in the sense of Lebesgue its measure is 
designated by |E|. 
All functions considered in this paper are real-valued. 




364 C. R. ADAMS AND A. P. MORSE [May 


(1.2) DEFINITION. For E a measurable linear set and 0 < δ ≤ ∞ a δ-partition 
F of E is a countable family of measurable, essentially disjoint sets satisfying 
the conditions 


\sum_{B \in F} B = E, \qquad \operatorname{diam} B < \delta \ \text{ for } B \in F. 


F will be spoken of as an infinite family if and only if the sets B ∈ F are infinite 
in number. The aggregate of all δ-partitions of E will be represented by Γ_δ(E). 
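For a concrete feel for Definition (1.2), the sketch below checks the two conditions in the simplest special case, a finite family of closed intervals covering an interval E. It is an illustrative assumption, not part of the paper: the definition itself admits arbitrary measurable sets and countably infinite families, and the names diam, is_delta_partition and uniform_partition are hypothetical.

```python
def diam(interval):
    """Diameter sup B - inf B of an interval B = (a, b)."""
    a, b = interval
    return b - a

def is_delta_partition(intervals, E, delta, tol=1e-12):
    """Check that closed intervals tile E = (lo, hi) with measure-zero overlaps and diam < delta."""
    lo, hi = E
    pieces = sorted(intervals)
    if not pieces or any(diam(p) >= delta for p in pieces):
        return False
    if abs(pieces[0][0] - lo) > tol or abs(pieces[-1][1] - hi) > tol:
        return False
    # Consecutive pieces must abut: a gap loses part of E, an overlap has positive measure.
    return all(abs(p[1] - q[0]) <= tol for p, q in zip(pieces, pieces[1:]))

def uniform_partition(E, n):
    """The n equal subintervals of E = (lo, hi)."""
    lo, hi = E
    h = (hi - lo) / n
    return [(lo + i * h, lo + (i + 1) * h) for i in range(n)]

if __name__ == "__main__":
    print(is_delta_partition(uniform_partition((0.0, 1.0), 10), (0.0, 1.0), 0.2))
```

Adjacent closed intervals share only an end point, a set of measure zero, which is what "essentially disjoint" allows.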


(1.3) DEFINITION. For E a measurable linear set L*(E) is the class of functions 
defined by the condition f ∈ L*(E) if and only if the domain of f is E and 
the condition 
f 
B 


is satisfied for each measurable subset B ⊂ E with 
|B| > 0, diam B < ∞. 
Clearly we have L(E) ⊂ L*(E), with L(E) = L*(E) if and only if E is 
essentially bounded. 


(1.4) DEFINITION. For f ∈ L*(E) and B a measurable subset of E with 
0 < |B| < ∞ we define 


M_B f = \frac{1}{|B|} \int_B f. 


The set of numbers of the form M_B f, where B is a measurable subset of E of positive 
measure with diam B < ∞, will be designated by R(f) and the closure of this 
set by \bar{R}(f). 


It is easily seen that when |E| is > 0, R(f) is an interval, and that this 
interval may be open, semi-open, or closed, as well as bounded or unbounded; 
the end points of this interval, whether finite or infinite, are 
ess inf_{x∈E} f(x) and ess sup_{x∈E} f(x). The subset of E on which f is not in 
R(f) is of measure zero. 


(1.5) DEFINITION. For f ∈ L*(E) and φ a function whose domain includes 
R(f) we interpret 

\varphi[M_B f] \, |B| \quad \text{for } B \subset E, \ |B| = 0, 

as zero. For 0 < δ ≤ ∞ and F ∈ Γ_δ(E) we assign to the numerical sum 

\sum_{B \in F} \varphi[M_B f] \, |B| 

its natural intuitional meaning (to be made precise later in Definition (2.7)) and 
define two limits, which are actually of the nature of a lower integral and an upper 
integral respectively over E, as follows: 

S_*(f, \varphi, E) = \lim_{\delta \to 0} \inf_{F \in \Gamma_\delta(E)} \sum_{B \in F} \varphi[M_B f] \, |B|, 

S^*(f, \varphi, E) = \lim_{\delta \to 0} \sup_{F \in \Gamma_\delta(E)} \sum_{B \in F} \varphi[M_B f] \, |B|. 

That each of these limits exists (finite or infinite) follows from the fact that 
0 < δ_1 ≤ δ_2 ≤ ∞ implies Γ_{δ_1}(E) ⊂ Γ_{δ_2}(E). 
When and only when these limits are the same we define 

-\infty \le S(f, \varphi, E) = S_*(f, \varphi, E) = S^*(f, \varphi, E) \le \infty. 

Similarly we define two other limits, involving the sampling procedure, for any 
fixed real number k ≥ 1: S_*(f, φ, E, k) [S^*(f, φ, E, k)] is the limit, as δ → 0, of the inf 
[sup] of numbers of the form 

\sum_{B \in F} \varphi[M_{AB} f] \, |B|, 

where F ∈ Γ_δ(E) and A is a measurable linear set variable with F and for each F 
variable within the restriction 

(1.6)    |B| \le k|AB| \quad \text{for } B \in F. 

If and only if these limits are the same we write 

-\infty \le S(f, \varphi, E, k) = S_*(f, \varphi, E, k) = S^*(f, \varphi, E, k) \le \infty. 


(1.7) DEFINITION. For E a linear set, f a function whose domain is E, and φ 
a function whose domain includes the range of f, we represent by φ:f the function g 
defined on E by the condition 


g(x) = \varphi[f(x)] \quad \text{for } x \in E. 


Our main problem is to determine conditions on f and φ which will insure 
the (finite) existence of ∫_E φ:f and of S(f, φ, E, k) and the equality 


(1.8)    \int_E \varphi{:}f = S(f, \varphi, E, k). 


Actually the conditions which we obtain will insure 


lim sup | f ¢[Masf]| B| 


FEr;(E) 


1.9 
(1.9) =lim sup > | — ¢[Mas/]|| Bl = 0, 
BEF 


which implies (1.8). 
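The equality (1.8) can be tried out numerically in a very simple case. The sketch below is an illustration under stated assumptions, not the paper's method: E is taken as the interval [0, 1], the δ-partition as n equal subintervals B, and the sample A∩B as the left-hand part of B of length |B|/k, so that the restriction (1.6) holds with equality. It evaluates the sum for these particular partitions only, whereas the limits in Definition (1.5) range over all partitions in Γ_δ(E). The choices f(x) = x and φ(y) = y², and all function names, are assumptions.

```python
def mean_value(F, a, b):
    """Mean of f over [a, b], computed from an antiderivative F of f."""
    return (F(b) - F(a)) / (b - a)

def partition_sum(F, phi, n, k=1.0):
    """Sum of phi[M_{AB} f]*|B| over the n equal subintervals B of [0, 1].

    The sample A∩B is the left-hand part of B of length |B|/k, so |B| = k|AB|;
    k = 1 means that no sampling is involved.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        a = i * h
        total += phi(mean_value(F, a, a + h / k)) * h
    return total

if __name__ == "__main__":
    F = lambda x: 0.5 * x * x   # antiderivative of the assumed f(x) = x
    phi = lambda y: y * y       # assumed phi(y) = y^2; the integral of phi:f over [0, 1] is 1/3
    for n in (10, 100, 1000):
        print(n, partition_sum(F, phi, n), partition_sum(F, phi, n, k=2.0))
```

Both columns approach 1/3 as n grows, the sampled sums more slowly, since each sampled mean sits a quarter of a subinterval to the right of the left end point rather than at the midpoint.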
For convenience of reference we attach numbers to the following particular 
cases of (1.8) and (1.9) in which no sampling is involved: 


(1.10)    \int_E \varphi{:}f = S(f, \varphi, E), 
(1.11)    \lim_{\delta \to 0} \sup_{F \in \Gamma_\delta(E)} \sum_{B \in F} | M_B \varphi{:}f - \varphi[M_B f] | \, |B| = 0. 

The result quoted above from RS shows that the conditions f ∈ L(E), 
φ(y) = y for y ∈ R(f) are sufficient for (1.8). In RS the condition (1.9) could 
equally well have been obtained, but it was not. It has also been shown by 
Banach(2) that if E = J, the unit interval 0 ≤ x ≤ 1, and if the sets B of the 
partitions in Γ_δ(J) are restricted to be a finite set of intervals, the conditions 
f ∈ L_p(J), p > 1, φ(y) = |y|^p for y ∈ R(f) are sufficient for (1.10). 

It should be noted that if $\phi$ is regarded as defined on the infinite interval
$-\infty<y<\infty$ and fixed, the subclass of $L(E)$ singled out by the condition
$\int_E\phi:f$ exists may naturally be called $L_\phi(E)$, thus generalizing the familiar $L_p$
classes of functions. This is by no means the first time that such generalizations
have been considered. In fact they were introduced as early as 1924 by
W. H. Young(3) who used the term “super-summability” in this connection.
This notion was further developed, and related problems studied, by Young
and by others including Burkill, Kaczmarz and Nikliborc, Orlicz, and Birnbaum
and Orlicz(4). There is no need to describe here these earlier investigations;
it is sufficient to remark that the objectives of their authors were quite
different from ours and were such as to require hypotheses on the function $\phi$
more restrictive than those which we shall impose.

2. Further definitions and preliminaries. We collect here a number of 
definitions which are to be used more or less generally throughout this paper. 
Other definitions, of only local use in the subsequent developments, will be 
formulated when needed. 


(2.1) DEFINITION. If Q is a condition involving the symbol a, $E_a[Q]$ stands
for the set defined by the condition $b\in E_a[Q]$ if and only if Q is satisfied when b
is substituted for a.


(2.2) DEFINITION. If A is a set and $a\in A$ we define $\{a\}$ as $E_x[x=a]$.


(2) Banach, Sur les opérations dans les ensembles abstraits et leur application aux équations
intégrales, Fund. Math. vol. 3 (1922) pp. 133-181; especially pp. 175-177. We use $L_p(E)$ to
stand for the class of functions f with $f\in L(E)$, $|f|^p\in L(E)$.

(3) W. H. Young, The progress of mathematical analysis in the twentieth century, Proc.
London Math. Soc. (2) vol. 24 (1925-1926) pp. 421-434.

(4) Burkill, The strong and weak convergence of functions of general type, Proc. London Math.
Soc. (2) vol. 28 (1928) pp. 493-500; Kaczmarz and Nikliborc, Sur les suites des fonctions
convergentes en moyenne, Fund. Math. vol. 11 (1928) pp. 151-168; Orlicz, Beiträge zur Theorie der
Orthogonalentwicklungen, Studia Mathematica vol. 1 (1929) pp. 1-39, 241-255; Birnbaum and
Orlicz, Über Approximation im Mittel, ibid. vol. 2 (1930) pp. 197-206, and Über die
Verallgemeinerung des Begriffes der zueinander konjugierten Potenzen, ibid. vol. 3 (1931) pp. 1-67.




1943] APPROXIMATING INTEGRALS BY SUMS 367 


(2.3) DEFINITION. The symbol ~, placed above another symbol, negates the
meaning of the other symbol; in particular, if A is a set, $a\mathrel{\tilde{\in}}A$ means that a is
not an element of A.


(2.4) DEFINITION. For a and b real numbers with $a\le b$ we define $[a,b]$ as
the closed interval $E_x[a\le x\le b]$; for $a<b$ the open interval $E_x[a<x<b]$ is
represented by $(a,b)$. The unit interval $[0,1]$ is designated by I.


(2.5) DEFINITION. If F is a family of sets B, we sometimes employ for the
set $\sum_{B\in F}B$ the abbreviation $\sigma(F)$.
It should be noted that $x\in\sigma(F)$ if and only if there exists $B\in F$ with $x\in B$.


(2.6) DEFINITION. Two sets A and B are called disjoint if and only if AB is
vacuous (that is, $AB=0$). A family F of sets B is called disjointed if and only
if $B_1B_2=0$ whenever $B_1\in F$, $B_2\in F$ and $B_1\ne B_2$.

Adopting the convention that $a+\infty=\infty$, $a-\infty=-\infty$ when $-\infty<a<\infty$
and that $\infty-\infty$ is meaningless, we set up the following formal definition for
numerical sums.

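(2.7) DEFINITION. If H is a countable set with N ($\le\infty$) elements and we
have $-\infty\le a_x\le\infty$ for $x\in H$, the numerical sum $\sum_{x\in H}a_x$ is the unique number $\lambda$
($-\infty\le\lambda\le\infty$), if any exists, such that

\[ \lambda = \sum_{j=1}^{N} a_{x_j} \]

whenever

\[ H = \sum_{j=1}^{N}\{x_j\} \quad\text{with } x_m\ne x_n \text{ for } 1\le m<n<N+1. \]

If H is vacuous we understand

\[ \sum_{x\in H} a_x = 0. \]

It should be noted that $|\sum_{x\in H}a_x|<\infty$ implies $\sum_{x\in H}|a_x|<\infty$.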

(2.8) DEFINITION. For J a linear interval with $0<|J|<\infty$ a $\delta$-partition
of J in which each set B is an interval (open, semi-open, or closed) with $|B|>0$
is called an interval $\delta$-partition of J; the aggregate of all such interval $\delta$-partitions
is represented by $\Gamma'_\delta(J)$; and limits corresponding to those of Definition (1.5) are
designated by


Sx(f, k), S*(f,¢,J, k), and so on. 


(2.9) DEFINITION. For E a linear set the symbols C(E), UC(E), and B(E)
are used to designate, respectively, the classes of functions f the domain of each
of which is E and each of which is continuous on E, uniformly continuous on E,
and bounded on E. R(I) stands for the class of functions each of which is (properly)
Riemann integrable on I; R*(I) for the class of functions defined thus:
$f\in R^*(I)$ if and only if $f_N\in R(I)$ for each $N>0$, where $f_N(x)=f(x)$ for
$x\in E_x[-N\le f(x)\le N]$, $f_N(x)=-N$ for $x\in E_x[f(x)<-N]$, $f_N(x)=N$ for
$x\in E_x[f(x)>N]$, and $\lim_{N\to\infty}\int_I f_N$ exists (finite). The notation BC(E) [or
BUC(E) or $BL_\phi(E)$ or ...] indicates the intersection of the classes B(E)
and C(E) [or UC(E) or $L_\phi(E)$ or ...]. The condition $f\in L(I)-R(I)$ means
that f is an element of L(I) but not of R(I). If A stands for any one of the classes
mentioned here the relation $\phi\in A(E)$ is to be interpreted as meaning that $\phi$ is a
function of the class A on E; in contrast, the relation $f\in A(E)$ [$\phi:f\in A(E)$] is
understood to mean that f [$\phi:f$] is almost everywhere on E equal to a function
of the class A on E.


In §3 we prove some theorems in the theory of functions of sets. These 
considerations are made in a generality considerably greater than is neces- 
sary for our particular purposes, since they seem to us to be fundamental to 
that theory and to possess some intrinsic interest. They are applied in §4, 
where we determine certain properties of the sums that appear in Defini- 
tion (1.5). 

In §5 we prove a sequence of theorems of which Theorem (5.12), the
climax, gives a necessary and sufficient condition for (1.9) when $f\in L^*(E)$
and $\phi\in C(R(f))$. A sufficient condition, in terms of the existence of a convex
dominant for $|\phi|$ and of such nature as frequently to be useful in testing,
is given in Theorem (5.13). The necessity of this condition is the subject of
inquiry in §§7, 8 following the establishment of a number of preliminary
lemmas in §6.

Obviously the condition $f\in L^*(E)$ is necessary for the mere formulation
of the main questions as expressed by (1.8), (1.9), (1.10), and (1.11). The object
of §9 is to show that the hypothesis $\phi\in C(R(f))$ is also essential. In §10
we analyze the circumstances under which the k-hypothesis on the sample
sets can be relaxed; that is, the condition (1.6) replaced by

(2.10) \[ |AB|>0 \quad\text{for } B\in F\in\Gamma_\delta(E). \]


In §11, we apply some of the earlier results to functions which are of bounded 
variation or are absolutely continuous in a certain generalized sense. 

Our results hold also for a measurable set E in Euclidean n-space. Even
in §6 where the treatment appears to be peculiarly 1-dimensional, one need 
only alter Definition (6.1) to employ n-dimensional measure in connection 
with the function u (and later f) and linear measure in connection with the 
equimeasurable function v (and later g). 

In general definitions, theorems, and displayed relations to which other
than local reference is made will be numbered sequentially in a decimal system
with the first number indicating a section. A displayed relation referred
to only in the proof where it occurs will be labeled with a letter, such as (a).

3. Some theorems in the theory of functions of sets. In this section the
symbol 0 will always stand for the vacuous set. We understand that two
sets $E_1$ and $E_2$ are identical, and write $E_1=E_2$, if and only if $x\in E_1$ implies
and is implied by $x\in E_2$; otherwise we call the sets distinct and write $E_1\ne E_2$.
Under this definition of distinctness there is only one vacuous set. An aggregate
of sets will usually be spoken of as a family.


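(3.1) DEFINITION. A family $\mathfrak{G}$ is called a $\sigma$-field if and only if (i) H a countable
subfamily of $\mathfrak{G}$ implies $\sigma(H)\in\mathfrak{G}$, and (ii) $B\in\mathfrak{G}$ implies $\sigma(\mathfrak{G})-B\in\mathfrak{G}$.


If $\mathfrak{G}$ is an arbitrary family, $B\in\mathfrak{G}$ implies $B\subset\sigma(\mathfrak{G})$.

If $\mathfrak{G}$ is a $\sigma$-field, (i) implies $0=\sigma(0)\in\mathfrak{G}$; from (ii) then follows $\sigma(\mathfrak{G})\in\mathfrak{G}$;
whence we find that $B_1\in\mathfrak{G}$, $B_2\in\mathfrak{G}$ imply $B_1B_2\in\mathfrak{G}$ and $B_1-B_2=B_1[\sigma(\mathfrak{G})-B_2]\in\mathfrak{G}$.

The simplest and “smallest” $\sigma$-field is $\{0\}$. If S represents Euclidean
n-space ($1\le n<\infty$) and $E\subset S$ is a measurable set (of finite or infinite measure
in the sense of Lebesgue), the measurable subsets of E constitute a $\sigma$-field $\mathfrak{G}$
with $\sigma(\mathfrak{G})=E$.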


(3.2) DEFINITION. If A and B are sets, we define
\[ A*B = A\,E_\beta[\beta\subset B]. \]


The following theorem may easily be verified. 


(3.3) THEOREM. If $\mathfrak{G}$ is a $\sigma$-field and $B\in\mathfrak{G}$, then $\mathfrak{G}*B$ is a $\sigma$-field with
$\sigma(\mathfrak{G}*B)=B$.
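Theorem (3.3) can be checked by hand on a small finite family. The sketch below does so in Python, reading the operation of Definition (3.2) as "the members of the first family that are subsets of the second set"; that reading, the helper names `sigma`, `is_sigma_field`, and `star`, and the four-element example are assumptions of this illustration. For a finite family, closure under countable unions reduces to containing the vacuous set together with closure under pairwise unions.

```python
def sigma(family):
    """Union of all members of a family of sets (the set sigma(F) of Definition (2.5))."""
    union = frozenset()
    for member in family:
        union |= member
    return union

def is_sigma_field(family):
    """Finite analogue of Definition (3.1): the family contains the vacuous set,
    is closed under pairwise unions, and is closed under complement in sigma(family)."""
    top = sigma(family)
    has_empty = frozenset() in family
    unions_ok = all((a | b) in family for a in family for b in family)
    complements_ok = all((top - b) in family for b in family)
    return has_empty and unions_ok and complements_ok

def star(family, big):
    """Members of `family` that are subsets of `big` (assumed meaning of G * B)."""
    return {beta for beta in family if beta <= big}

# Hypothetical sigma-field on {1, 2, 3, 4} generated by the blocks {1, 2} and {3, 4}.
G = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), frozenset({1, 2, 3, 4})}
B = frozenset({1, 2})
G_star_B = star(G, B)
```

Here `G_star_B` consists of the vacuous set and B itself, which is again a field of sets with union B.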

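(3.4) DEFINITION. For $\mathfrak{G}$ a $\sigma$-field F is called a $\mathfrak{G}$-partition if and only if F
is a countable disjointed subfamily of $\mathfrak{G}$ with $\sigma(F)=\sigma(\mathfrak{G})$. The aggregate of all
$\mathfrak{G}$-partitions is represented by $\mathfrak{G}^*$.

It may be noted that if $\mathfrak{G}$ is a $\sigma$-field, $\{\sigma(\mathfrak{G})\}\in\mathfrak{G}^*$; thus $\mathfrak{G}^*$ is never
vacuous. Also $F\in\mathfrak{G}^*$ implies $F-\{0\}\in\mathfrak{G}^*$.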


(3.5) DEFINITION. If $\mathfrak{G}$ is a $\sigma$-field and f a function whose domain includes
$\mathfrak{G}$, f is said to be quasi-additive on $\mathfrak{G}$ when and only when

\[ \sup_{F\in\mathfrak{G}^*}\Bigl|\sum_{B\in F}f(B)\Bigr| < \infty. \]


A well known theorem on the summation of absolutely convergent double 
series yields a result which we find convenient for our present purposes to 
put in the following form. 


(3.6) THEOREM. If F is a countable family and for $B\in F$, $K_B$ is a countable
family; $K_{B_1}K_{B_2}=0$ whenever $B_1\in F$, $B_2\in F$, $B_1\ne B_2$; $S=\sum_{B\in F}K_B$; and f is a
function whose domain includes S and which satisfies the condition $|\sum_{s\in S}f(s)|<\infty$
(and therefore $\sum_{s\in S}|f(s)|<\infty$), then

\[ \sum_{s\in S}f(s) = \sum_{B\in F}\ \sum_{s\in K_B}f(s). \]


In the rest of this section frequent use will be made of this theorem. 


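(3.7) THEOREM. If f is quasi-additive on $\mathfrak{G}$ and $B\in\mathfrak{G}$, then f is quasi-additive
on $\mathfrak{G}*B$.

Proof. Let $E=\sigma(\mathfrak{G})$. In case $E-B=0$ we have $\mathfrak{G}*B=\mathfrak{G}*E=\mathfrak{G}*\sigma(\mathfrak{G})=\mathfrak{G}$
and the theorem is trivially true. If $E-B\ne 0$, we let F be an arbitrary
element of $(\mathfrak{G}*B)^*$, set $F'=\{E-B\}+F$, note that $F'\in\mathfrak{G}^*$, and observe that
the relation

\[ \Bigl|\sum_{\beta\in F}f(\beta)\Bigr| \le |f(E-B)| + \Bigl|f(E-B)+\sum_{\beta\in F}f(\beta)\Bigr| = |f(E-B)| + \Bigl|\sum_{\beta\in F'}f(\beta)\Bigr| \]

completes the proof.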

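(3.8) LEMMA. If f is quasi-additive on $\mathfrak{G}$, $A\in\mathfrak{G}$, $B\in\mathfrak{G}$, $A+B=\sigma(\mathfrak{G})$,
$AB=0$, and f is bounded on $\mathfrak{G}*A$ and on $\mathfrak{G}*B$, then f is bounded on $\mathfrak{G}$.


Proof. Let $E=\sigma(\mathfrak{G})$,
\[ N=\sup_{F\in\mathfrak{G}^*}\Bigl|\sum_{s\in F}f(s)\Bigr|,\qquad N_A=\sup_{\beta\in\mathfrak{G}*A}|f(\beta)|,\qquad N_B=\sup_{\beta\in\mathfrak{G}*B}|f(\beta)|. \]

Let $\gamma'$ be an arbitrary element of $\mathfrak{G}$, $\alpha'=A(E-\gamma')$, $\beta'=B(E-\gamma')$, and note
that $\alpha'\beta'=\beta'\gamma'=\gamma'\alpha'=0$. If $\alpha'+\gamma'=0$, the relation

\[ |f(\gamma')| \le N+N_A+N_B \]
is apparent; if $\alpha'+\gamma'\ne 0$ and $\beta'=0$, it is a consequence of the inequality
\[ |f(\alpha')+f(\gamma')| \le N; \]
and if $\alpha'+\gamma'\ne 0$ and $\beta'\ne 0$, it follows from the inequality
\[ |f(\alpha')+f(\beta')+f(\gamma')| \le N. \]
Thus the proof is complete.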


(3.9) LEMMA. If f is quasi-additive but unbounded on $\mathfrak{G}$, there exist sets A
and B with

\[ A\in\mathfrak{G},\quad B\in\mathfrak{G},\quad A+B=\sigma(\mathfrak{G}),\quad AB=0,\quad |f(A)|>1+|f(0)|, \]

f unbounded on $\mathfrak{G}*B$.
Proof. Let E and N have the meaning assigned them in the last proof.
Since f is unbounded on $\mathfrak{G}$ we take $B_0\in\mathfrak{G}$ such that
\[ |f(B_0)| > N+1+|f(0)| \]
and let $A_0=E-B_0$. Clearly we have $B_0\ne 0$ and
\[ A_0\in\mathfrak{G},\quad B_0\in\mathfrak{G},\quad A_0+B_0=E,\quad A_0B_0=0,\quad |f(B_0)|>1+|f(0)|, \]
\[ |f(A_0)| \ge |f(B_0)| - |f(A_0)+f(B_0)| > 1+|f(0)|. \]
From Lemma (3.8) we infer that f is either unbounded on $\mathfrak{G}*B_0$ or unbounded
on $\mathfrak{G}*A_0$; in the first alternative the desired conclusion is reached
by taking $A=A_0$ and $B=B_0$, whereas in the second case it is obtained by
taking $A=B_0$ and $B=A_0$.
A proof of a well known theorem on completely additive functions of
sets(5), which was communicated to us some time ago by R. M. Robinson,
has been of use to us at this juncture. The following is a generalization of
that theorem.

(3.10) THEOREM. If f is quasi-additive on $\mathfrak{G}$, f is bounded on $\mathfrak{G}$.

Proof. We shall show that the assumption of the hypothesis and the contrary
of the conclusion leads to a contradiction. Let $B_0=\sigma(\mathfrak{G})$. In view
of Theorem (3.7), Lemma (3.9), and Theorem (3.3) there exist sets
$A_1, A_2, A_3,\dots$ and $B_1, B_2, B_3,\dots$ satisfying the following recursive condition:
if n is a positive integer and

A, G+ B, E G+ A, + B, = A,B, 0, 
\f(An)|>1+|f0]|, f is unbounded on G+ B,, 
then we have 
Anti G+B,, Bari G+ B,, Ant + = B,, Anti Ba+1 = 0, 
| f(Ans1) | >1+ | f(0) |, fis unbounded on 


Clearly we also have 
BoD Bi DB: D:::, 
AmB, C AmBn-1 C AmBu = 0 forlism<n, 


Defining 


Ao = Fon (Ad) 


we see that A,,A, =0 for 0 Sm<n. From this and the fact that 
(5) See, for example, Saks, Theory of the integral, Warsaw, 1937, Theorem (6.1), p. 10. 




| > 1+] /(0)| implies A, ¥0 = 1, 2,3,---), 
we infer 
Am # A, forO Sm <n. 


But Fy is a G-partition; hence 


= | = < 


and f(A,) tends to zero as m—, in contradiction to the inequalities 
>1+|f(0)| (#=1, 2,3, -- +). 

(3.11) DEFINITION. A family H is called hereditary if and only if the relation
$\beta\subset A\in H$ implies $\beta\in H$.

It may be noted that the families 0 and $\{0\}$ are hereditary.

For $0<\delta\le\infty$ let $H_\delta$ represent the family of all subsets of the real number
system having diameter $<\delta$; $H_\delta$ is then hereditary. If E is a measurable set
of real numbers and $\mathfrak{G}$ represents the family of all measurable subsets of E,
$\mathfrak{G}$ is a $\sigma$-field and we have

\[ \mathfrak{G}^* * H_\delta = \Gamma_\delta(E). \]


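(3.12) THEOREM. If $\mathfrak{G}$ is a $\sigma$-field, H is an hereditary family, and f is a function
whose domain includes $\mathfrak{G}H$, then the relation

\[ \sup_{F\in\mathfrak{G}^* * H}\Bigl|\sum_{\beta\in F}f(\beta)\Bigr| = N < \infty \]
implies

\[ \sup_{F\in\mathfrak{G}^* * H}\ \sum_{\beta\in F}|f(\beta)| < \infty. \]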


Proof. Since sup 0= — , the theorem is trivial in the case of @** H=0. 
Henceforth we assume @** H0; let E=oa(@); let V be some fixed element 
of @** H; for BEG let 


Ve=VE[BBx 0], Ks= {BB}; 
BEV, 


and let g be a function with 
g(8) = f(BB) for 8B ©. 


BEV, 


The first step in the proof is to show that the function g is quasi-additive 
on ©. For any FE @* let 


W=)D Kz. 
BEF 




W is then an element of @* and since H is hereditary we infer WE G* + H. 
Moreover, it is readily seen that Ks,Ks,=0 whenever 
and also that {8,8} {B28} =0 whenever BE F, 6: Vp, Vg, Be. Two ap- 
plications of Theorem (3.7) thus yield the relation 


BEF BEVg 


BEF BEVs 


an immediate implication of which is the inequality 


| 


The second step is to establish the existence of a number M with 


for every countable disjointed subfamily D of @H. Using the result of the 
first step and Theorem (3.10) we infer the existence of a number N; with 
| ¢(8)| Ni<@ for 8 € G. 


Let M=N+W;; let D be any countable disjointed subfamily of GH; let 
C=E-—c¢(D); let 
Fy=D+ {BC}, 


BEVe 
and note that « H. We have 


BEVe BEVe 


- + gC), 
sED 


Finally, let 


We then have the desired conclusion: 


BED, 


whence 
ser 




It may be remarked that if in Theorem (3.12) we let H stand for the 
hereditary family Ec[CCo(@)], then H= 

4. Concerning the behavior of the sums. The sums considered in Defini- 
tion (1.5) have a considerable number of properties which are seen immedi- 
ately and which it is essential to note. Partly for convenience of reference we 


assemble some of these properties in 
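(4.1) THEOREM. Let $f\in L^*(E)$ and $\phi$ and $\phi_1$ be functions each of whose
domains includes $R(f)$.
(i) Since $0<\delta_1\le\delta_2\le\infty$ implies $\Gamma_{\delta_1}(E)\subset\Gamma_{\delta_2}(E)$, the $\inf_{F\in\Gamma_\delta(E)}$ [$\sup_{F\in\Gamma_\delta(E)}$]
of sums of the form

\[ \sum_{B\in F}\phi[M_{AB}f]\,|B| \quad\text{or}\quad \sum_{B\in F}\bigl|M_B\phi:f-\phi[M_{AB}f]\bigr|\,|B| \]

is non-decreasing [non-increasing] as $\delta$ decreases ($0<\delta\le\infty$).
For $0<\delta\le\infty$ and $F\in\Gamma_\delta(E)$ we have the following properties of sums:
(ii) $k=1$ implies $|AB|=|B|$ for $B\in F$, $|AE|=|E|$, and

\[ \sum_{B\in F}\phi[M_{AB}f]\,|B| = \sum_{B\in F}\phi[M_B f]\,|B|; \]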

(iii) 0<$(y) for yER(f) implies 


sup >> ¢[Mtasf]| B| S sup 
Fer,(Z) Ber FET; 


(iv) $y) for yER() implies 
| B| $1([Mas/] | B| ; 


dX [Meas] | ; 
BEF 


(v) (6+¢1)(y) =6(y) +¢1(y) for yER(f) implies 
» (6 + [Musf]| B| = B| + | B| 


and 
inf Bl + inf B| 
FEr;(®) Ber Ber 
< inf S sup DY [Mass]| Bl 
BEF FEr;(ZE) BEF 
< sup sup Bl. 
BEF BEF 
Corresponding properties of limits of sums are: 


(vi) $S_*(f,\phi,E,1)=S_*(f,\phi,E)\le S^*(f,\phi,E)=S^*(f,\phi,E,1)$;

(vii) $0\le\phi(y)$ for $y\in R(f)$ implies
\[ S^*(f,\phi,E,k)\le k\,S^*(f,\phi,E); \]
(viii) $\phi(y)\le\phi_1(y)$ for $y\in R(f)$ implies
\[ S_*(f,\phi,E,k)\le S_*(f,\phi_1,E,k),\qquad S^*(f,\phi,E,k)\le S^*(f,\phi_1,E,k); \]
(ix) $(\phi+\phi_1)(y)=\phi(y)+\phi_1(y)$ for $y\in R(f)$ implies

\[ S_*(f,\phi,E,k)+S_*(f,\phi_1,E,k) \le S_*(f,\phi+\phi_1,E,k) \le S^*(f,\phi+\phi_1,E,k) \le S^*(f,\phi,E,k)+S^*(f,\phi_1,E,k). \]


It must be emphasized that when we write, as in the theorem above,

\[ \sum_{B\in F}\phi[M_{AB}f]\,|B|, \]

without inf or sup ahead of it, A is understood to be an arbitrary fixed set
satisfying condition (1.6); whereas, when

\[ \inf_{F\in\Gamma_\delta(E)} \ \text{or}\ \sup_{F\in\Gamma_\delta(E)}\ \sum_{B\in F}\phi[M_{AB}f]\,|B| \]

is written, A is understood to vary with F with the freedom allowed by condition
(1.6). To define a notation that makes this distinction would introduce,
it seems to us, an unnecessary complication.
The next four theorems are fundamental for later developments. 


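(4.2) THEOREM. If $f\in L^*(E)$, $\phi$ is a function whose domain includes $R(f)$,
and $0<\delta\le\infty$, the condition

\[ \sup_{F\in\Gamma_\delta(E)}\Bigl|\sum_{B\in F}\phi[M_B f]\,|B|\Bigr| < \infty \]

implies
\[ \sup_{F\in\Gamma_\delta(E)}\ \sum_{B\in F}\bigl|\phi[M_B f]\bigr|\,|B| < \infty. \]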

Taking into account the remark immediately preceding the statement of 
Theorem (3.12), one sees at once that this theorem is a corollary of Theorem 
(3.12). 

COROLLARY. If $f\in L^*(E)$ and $\phi$ is a function whose domain includes $R(f)$,
the condition

\[ -\infty < S_*(f,\phi,E) \le S^*(f,\phi,E) < \infty \]
implies $0\le S^*(f,|\phi|,E)<\infty$.


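(4.3) THEOREM. If $f\in L^*(E)$, $0\le\phi(y)$ for $y\in R(f)$, $0<\delta\le\infty$, and $\epsilon>0$, the
condition

\[ \sup_{F\in\Gamma_\delta(E)}\ \sum_{B\in F}\phi[M_B f]\,|B| = M < \infty \]

implies the existence of $N=N(\epsilon)>0$ with

\[ 0<N<\infty,\qquad \sup_{F\in\Gamma_\delta(H)}\ \sum_{B\in F}\phi[M_B f]\,|B| < \epsilon \quad\text{for } H\subset E-[-N,N]. \]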
BEF 


Proof. Let F,GT3(Z) be such that 


BEFi 


Clearly F; contains a finite family F, satisfying the same inequality. Let NV 
satisfy the conditions 


0<N<o~, C[-—N,N]. 
For HCE—[-—N, N] and FET;(H), we then have the desired condition. 
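(4.4) THEOREM. If $f\in L^*(E)$, $\phi\in C(R(f))$, $0\le\phi(y)$ for $y\in R(f)$, $0<\delta\le\infty$,
\[ \sup_{F\in\Gamma_\delta(E)}\ \sum_{B\in F}\phi[M_B f]\,|B| < \infty, \]

and H is a measurable subset of E, then we have

\[ \lim_{|H|\to 0}\ \sup_{F\in\Gamma_\delta(H)}\ \sum_{B\in F}\phi[M_B f]\,|B| = 0. \]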


Proof. Let 
A(B) = ¢[Mzf]| for B€ F 


M = lim sup p> A(B), 
|Hi-0 BE FETS3(H) 


M being finite on account of the hypotheses. If M is >0, there exists 7 >0 
such that | H| <7 implies 
= s 3M/2. 
Let H; be a particular set H and F,€T;(H;) a particular family of sets satisfy- 
ing the conditions 


| Hi|<n/2, > A(B) > 4M/5. 
BEFi 


It is clear that F, can be taken as a finite family. If D is a measurable subset 
of E, we have 


lim for BE F;, | B| > 0; 


|D|-o 


hence, in view of the continuity of ¢ on R(f) and the finiteness of the family 


F,, there exists y with 0<y<7/2 such that | D| implies 
> A(B — D) > 4M/5. 


BEFi 
Let H; be a particular set H and F2€T3(H:2) a particular family of sets satisfy- 
ing the conditions 


A(B) >4M/s. 
BEF; 


Let 
G= {B-@;} 


BEFi 


and consider the family G+ F; of disjoint sets. We have 


BEFi BEF; 


BE 
whence 
3M/2= Dd A(B= » A(B) + >> A(B) > 4M/5 + 4M/5, 
BOG 


BEG+F, BEF; 


a contradiction which implies M =0 and completes the proof. 
Combining Theorems (4.2) and (4.4) we obtain the following result. 


(4.5) THEOREM. If $f\in L^*(E)$, $\phi\in C(R(f))$, and H is a measurable subset of
E, the condition

\[ -\infty < S_*(f,\phi,E) \le S^*(f,\phi,E) < \infty \]

implies
\[ \lim_{|H|\to 0} S^*(f,|\phi|,H) = 0. \]


In other words, the conclusion of this theorem is that S*(f, |¢|, H) is an 
absolutely continuous function of a measurable set HCE. 

5. Sufficient conditions for (1.9). Since (1.9) is trivially true when $|E|=0$,
we are justified in assuming $|E|>0$ in connection with the proof of each
theorem in this section. The first theorem is an extension of the lemma of RS.


(5.1) THEOREM. If E is essentially bounded, the conditions $f\in UC(E)$ and
$\phi\in C(R(f))$ are sufficient for (1.9), even with (1.6) replaced by (2.10).

Proof. We may assume f itself to be uniformly continuous on E; then f is
bounded on E, and $R(f)$ is seen to be a bounded closed interval. Thus
$\phi\in C(R(f))$ implies $\phi\in UC(R(f))$; $\phi:f$ is uniformly continuous on E; and we
have


for fos forF 
BEF “B 




Let 7=7(¢, €) >0 be such that 
yu» ERG), | 92 — y2| So imply | o(y:) — $(y2)| S ¢/(2| El); 
and let 5(f, 7) >0 be such that 
ws, x2 © E, | — imply | f(x) — 
For FET\(E), BEF, | B| >0, and x:1€B we then have 
a= inf oLf(x)] Mad: f, oLf(m)] olf(x)] = 8, B—a<¢/(2| 


a= inf f(x) S f(x) S sup f(x) = 4, b—asn, 
zEB zEB 


whence 
(a) | — o[Manf]| — | 

and 

| — o[Masf]|| Bl <<. 

BEF 

REMARK. The relation (1.9) can be established by means of inequality (a)
on the individual terms of (1.9) only under very special circumstances, such
as those hypothesized in Theorem (5.1). That this cannot be done even in the
simple case of $E=I$, $\phi(y)=y$ for $y\in R(f)$, when k is $>1$ and f has a single point
of essential discontinuity in I is illustrated by taking f on I to be the characteristic
function of the set $E_x[1/2\le x\le 1]$.


(5.2) THEOREM. If E is essentially bounded, the conditions $f\in L^*(E)=L(E)$
and $\phi\in UC(R(f))$ are sufficient for (1.9).

Proof. If the interval R(f) is not a closed set, it has a finite endpoint which 
does not belong to it. The hypothesis ¢€ UC(R(f)) implies that lim ¢(y) exists 
as y€R(f) approaches this endpoint. If at each such endpoint ¢@ is defined 
(or redefined) as this limit, we have ¢€ UC(R(f)). Clearly this extension of 
the definition of ¢ (or this redefinition of ¢) does not affect the sums in ques- 
tion; neither does it affect the value of fz $:f, since the set of points 
E E,|f(x)ER(f)] is of measure zero. In other words, we may assume 


UC(R(f)- 
Let 0<a<e/(1+|E]|) and 7=n(¢, «)>0 be such that 


2 ERG), | 91 — y2| So imply | — o(y2)| S 
then we have 
| $(y2) | S«/4+ | /(4n) for E K(f). 
Let K =[a, b] be a bounded closed interval containing all of E except for 


a set of measure zero which can and will be neglected; let f be a particular 
function summable on £; and let x; satisfy the conditions 
ess inf f(x) S S ess sup f(x), — < f(x) < 
zEE 
Then f can be defined (or redefined) as f(x:) for x K—E, so that f is sum- 
mable on [a, 5] and the new R(f) is the same as the old R(f). On K the func- 


tion f can be approximated arbitrarily closely in the mean by a function g 
continuous on K and having the property 


ess inf f(x) S ess inf g(x) S ess sup g(x) S ess sup f(x). 
sEK 
Thus we are assured of the existence of a function g satisfying the conditions 
E 


In accordance with Theorem (5.1) let 5=6 (g, ¢, 4/4) >0 correspond to 
¢,/4 for the functions g and ¢. The function ¢:f is measurable on EZ and we 
infer that it is summable on E from the inequality 


| < | | + + — | /(4n) for x € E, 


which expresses the dominance of |o:f| by a summable function. For 
FET;(E) we now have 


( Sor fore] +| | 3] 
+| 21 ) 


< 4/44 lf /(4n) + 41/44+(4/4) | 


+ ha f <atal <e 


For use in proving subsequent theorems we introduce two definitions and 
a lemma. 


(5.3) DEFINITION. If $f\in L^*(E)$, $\sum_{i=1}^{\infty}J_i$ will be called a normal expression
for $R(f)$ if and only if the following conditions are satisfied: (1) for each i, $J_i$ is a
bounded closed interval $[a_i,b_i]\subset R(f)$, with

\[ [a_i,b_i] = J_i \subset J_{i+1} \quad (i=1,2,3,\dots); \]
(2) if $\inf R(f)=a$ is finite and $a\in R(f)$, $a\in J_i$ ($i=1,2,3,\dots$); (3) if
$\sup R(f)=b$ is finite and $b\in R(f)$, $b\in J_i$ ($i=1,2,3,\dots$); and (4) $\sum_{i=1}^{\infty}J_i=R(f)$.

It should be observed that if R(f) is a bounded closed interval, J;=R(f) 
for each 4. 

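(5.4) LEMMA. Let $f\in L(E)$ and $\sum_{i=1}^{\infty}J_i$ be a normal expression for $R(f)$. For
$0<\delta\le\infty$ and $F\in\Gamma_\delta(E)$ let $G_i\subset F$ represent the family of sets $B\in F$ defined by
the condition

\[ M_B f\in R(f)-J_i \quad (i=1,2,3,\dots). \]
Then we have
\[ \lim_{i\to\infty}\ \sup_{F\in\Gamma_\delta(E)}\ \sum_{B\in G_i}|B| = 0. \]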

Proof. In case R(f) is a bounded closed interval, R(f) —J; is vacuous and 
B| =0 for each 7. In case sup R(f) = © we have b;= ©. For 
each 7 such that 6;>0, let H;CF represent the family of sets B defined by 
the condition Mzf>b;. Then we have 


which tends to zero with 1. 

In case sup R(f) =b< bER(f), we have b;=b. Let Hi;CF stand 
for the family of sets B defined by the condition Mzf>b;, and let ¢;=b—); 
(¢=1, 2,3, +--+). For any particular i and let 


B, = BE [b —«: < f()], B, = BE [f(x) < b — 2«], 


B; = BE [b — f(x) b — 


so that we have 


| Bl | BI | B| 
= b — «(2| /| +| Bs|/| Bl), 


whence, since ¢; is >0, 


< Maf = 




2| Bs| /| Bl +| Bs| /| Bl <1, 
2| Bs| +| Bs| <| Bl =| Bil +] Bs| +] Bal, 


| <| Bil, 
and 


| B| <2| Bi|+|Bs| 2(| +| Bs|). 
Hence we obtain 


| Bl 2| [b — 26 f(x)]|. 
BER; 


But from the inequality (¢=1, 2, 3, - - - ) we have 


E [b — 2ei41 f(x)] C E [b — 26 f(x)] (i = 1, 2, 3,-++) 


which implies 


Oslim sup |B|S2lim | — f(x)]| 
FET;(E) t+ 


= 2 TI 20 f(a)] = 2|E [6 f(x)]|=0. 


The cases inf R(f)=— © and inf R(f)=a>— a€R(f) are entirely 
similar to those already considered, so that the lemma may now be regarded 
as established. 

(5.5) Derinition. For fEL*(E), pEC(R(f)), 0 SG(y) when yER(f), and 
a normal expression for R(f), we call the sequence $; (¢=1, 2, 3,---), 
where for eacht 

¢(y) for [a;, bs], 


inf for vyER(Y), 
bs stsy 


a normal approximating sequence for ¢ on R(f). 

We note that if R(f) is a bounded closed interval, ¢; is identical with ¢ 
for each 4. It is essential to observe also the following properties of a normal 
approximating sequence: 

UC(R(f)), 

—o(y) = 90 for yEJi, 

0S oy) So(y) for yERY), (¢=1,2,3,---); 
lim ¢i(y) = ¢(y) for yER(/). 




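Since the set of points $E\,E_x[f(x)\mathrel{\tilde{\in}}R(f)]$ is of measure zero, it follows that
we have

(5.6) \[ 0\le\phi_i:f(x)\le\phi:f(x)\quad(i=1,2,3,\dots),\qquad \lim_{i\to\infty}\phi_i:f(x)=\phi:f(x) \]

for almost all $x\in E$.


(5.7) THEOREM. The conditions $f\in L^*(E)$ and $\phi\in C(R(f))$ imply

\[ \int_E|\phi:f| \le S_*(f,|\phi|,E). \]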


Proof. Let $\phi_i$ ($i=1,2,3,\dots$) be a normal approximating sequence for
$|\phi|$ on $R(f)$, and let H represent any essentially bounded measurable subset
of E. Theorems (5.2) and (4.1) (viii) imply

\[ \int_H\phi_i:f = S(f,\phi_i,H) = S_*(f,\phi_i,H) \le S_*(f,|\phi|,H) \le S_*(f,|\phi|,E) \quad (i=1,2,3,\dots); \]

and by aid of (5.6) and a well known theorem of Fatou we obtain
\[ \int_H|\phi:f| \le \liminf_{i\to\infty}\int_H\phi_i:f, \]

whence

\[ \int_H|\phi:f| \le S_*(f,|\phi|,E). \]

The desired conclusion then follows from the arbitrariness of H.


(5.8) THEOREM. The conditions $f\in L^*(E)$, $\phi\in UC(R(f))$, $0\le\phi(y)$ for
$y\in R(f)$, and $S^*(f,\phi,E)<\infty$ are sufficient for (1.9).


Proof. If E is essentially bounded, the conclusion follows from Theorem 
(5.2) without use of the hypothesis S*(f,¢, EZ) < ©. Otherwise, this hypothesis 
implies the existence of 7 satisfying the conditions 


(5.9) sup ¢[Msf]| Bl < 
FEr,(Z) Ber 


In accordance with Theorem (4.3) let N;=Ni(€/3k) >0 insure 


sup ¢[Msf]| < «/(3z) for H C E — [— Ni, 
Fer,(4) Ber 


From Theorem (5.7) we infer /z@:f < © ; thus there exists N with 


NisN<o, for<es for H C E— [—N, N]. 




Let 
Ey = +1], 


and in accordance with Theorem (5.2) let 5=6(f, ¢, Ew, €/3), 0<5S7, corre- 
spond to ¢/3 for the functions f and ¢ on Ey. For FET;(£) let FiCF stand 
for the family of sets BEF defined by the condition BC Ey, and let 
F,= F—F,. Then we have, by aid of Theorem (4.1) (iii), 


<¢/3+6¢/3 +k sup ¢[Msf]| <«, 


where H=a(F2) CE- [—™,, 
For the proof of the next theorem the following lemma will prove helpful.

(5.10) LEMMA. Let $f\in L^*(E)$, $\phi\in C(R(f))$, $0\le\phi(y)$ for $y\in R(f)$, $S^*(f,\phi,E)<\infty$,
and $\phi_i$ ($i=1,2,3,\dots$) be a normal approximating sequence for $\phi$ on
$R(f)$. Then there exists $\eta$ with $0<\eta\le\infty$ such that we have
\[ \lim_{i\to\infty}\ \sup_{F\in\Gamma_\eta(E)}\ \sum_{B\in F}(\phi-\phi_i)[M_{AB}f]\,|B| = 0. \]
Proof. If R(f) is a bounded closed interval, ¢(y) —¢:(y) =0 for yER(f) 
(¢=1, 2, 3, - - - ) and the conclusion is obvious with any 7 without use of the 
hypothesis S*(f, ¢, E) < ©. Otherwise, this hypothesis implies the existence 
of » satisfying conditions (5.9). In accordance with Theorem (4.3) let 
N=N(e/2k) >n>0 be such that 
sup » $[Msf]| B| < «/(2k) for HC E-—[-VN,N], 


FET,(4) BEF 
and in accordance with Theorem (4.4) let y =y(f, ¢, 7, €/(22)) >0 be such that 
HCE, |H|<y imply sup”) ¢[Msf]| Bl < «/(22). 
FET,(4) BEF 


Let Ey=[—N—n, N+n]E and in accordance with Lemma (5.4) let m be 


such that 
sup > |Bl<y for i > m, 
FEr,(Ey) BEG; 


where G;C F has the meaning ascribed to it in that lemma. For FET ,(£) 
let F,CF be defined by the condition BEF, if and only if [—N, N]B0, 
and let F,= F—F;. For each ¢ we then have, by aid of Theorem (4.1) (iii), 




sup > se LY 4) B| 
FEr,(Z) Ber FEr,(E) Ber 


<ksup +esup (@ — Bl, 
FP, BEF: PF, BEF; 
where the second term does not exceed 


ksup >> ¢[Msf]| B| < ke/(2k) = 
F, BEF: 


and for i>m the first term is less than or equal to 
FET, (Ey) BEF FEr,(Ey) BEG; 
<k sup ¢[Maf]| Bl < be/(2k) = ¢/2. 
FET,(Ey) BEG; 
COROLLARY. The hypotheses of this lemma imply

\[ \lim_{i\to\infty} S^*(f,\phi-\phi_i,E,k) = 0. \]

(5.11) THEOREM. If $f\in L^*(E)$, $\phi\in C(R(f))$, and $0\le\phi(y)$ for $y\in R(f)$, the
condition $S^*(f,\phi,E)<\infty$ is sufficient (as well as obviously necessary) for (1.9).


Proof. If R(f) is a bounded closed interval, the conclusion follows at once 
from Theorem (5.8). Otherwise, the hypothesis S*(f, ¢, EZ) < ©, by Theorem 
(5.7), implies the existence of {x@:f; and it also implies the existence of 7 satis- 
fying conditions (5.9). Let ¢; (¢=1, 2, 3, - - - ) be a normal approximating 
sequence for ¢ on R(f). Then for0<6<n, FET;(Z), and each t we may write 


sf los — oes! + f 
> » | — BI. 


From (5.6) and the inequality [x@:f < © we infer, by Lebesgue’s convergence 
theorem, the existence of a number JN; with 


ff fori > Ni. 


Lemma (5.10) implies the existence of a number N2 N, with 
BEF 


= sup >> (¢— ¢)[Musf]| Bl 
Ber 




Ss sup @— [Muasf]| Bl] <¢/3 fori> 
FeEr,(Z) Ber 


Having fixed «> N, in accordance with Theorem (5.8) we let 5=4 (f, gi, €/3), 
0<6<7, be such that the second term on the right in (a) is <€/3. This com- 
pletes the proof. 


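(5.12) THEOREM. If $f\in L^*(E)$ and $\phi\in C(R(f))$, the condition
\[ -\infty < S_*(f,\phi,E) \le S^*(f,\phi,E) < \infty \]
is sufficient (as well as obviously necessary) for (1.9).


Proof. By the corollary to Theorem (4.2), the hypotheses of the present
theorem imply $S^*(f,|\phi|,E)<\infty$. For $y\in R(f)$ let

\[ \phi_1(y) = [\,|\phi(y)|+\phi(y)\,]/2,\qquad \phi_2(y) = [\,|\phi(y)|-\phi(y)\,]/2, \]
whence
\[ \phi(y) = \phi_1(y)-\phi_2(y), \]
and for $i=1,2$ we have
\[ \phi_i\in C(R(f)),\quad 0\le\phi_i(y)\le|\phi(y)| \text{ for } y\in R(f),\quad S^*(f,\phi_i,E)\le S^*(f,|\phi|,E)<\infty. \]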


In accordance with Theorem (5.11), therefore, let 5;= 6; (f, ¢:, €/2) >0 corre- 
spond to ¢/2 for the functions f and ¢; (¢=1, 2). Then we have, for 
5=min [6:, and FET;(Z), 


fesr- [Masf] | 


BEF 


BEF!“B 


The following corollary is now evident. 


COROLLARY. For $f\in L^*(E)$ and $\phi\in C(R(f))$, the conditions (1.8), (1.9),
(1.10), and (1.11) are equivalent.


(5.13) THEOREM. If $f\in L^*(E)$ and $\phi\in C(R(f))$, the conditions $|\phi(y)|\le\psi(y)$
for $y\in R(f)$, $\psi$ convex on $R(f)$, and $\int_E\psi:f<\infty$ are sufficient for

(a) sup | >. < n = diam F, 
Fer,(2)! ser 

and therefore for (1.9). 




Proof. Using Jensen's inequality(6) we obtain, for $F\in\Gamma_\eta(E)$,


BEF BEF BOF 


| = f ©, 
BEF 


from which follows $-\infty<S_*(f,\phi,E)\le S^*(f,\phi,E)<\infty$, the condition
hypothesized in Theorem (5.12).
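The role of Jensen's inequality in this proof can be seen in a discrete model: if E is replaced by N atoms of equal measure and each set B of a partition by a block of consecutive atoms, then $\psi[M_B f]\,|B|\le\int_B\psi:f$ for convex $\psi$, so the sums stay below the integral. The sketch below checks this numerically; the sample function $\sin 6x$, the choice $\psi(y)=y^2$, and the name `jensen_sums` are assumptions of the illustration only.

```python
import math

def jensen_sums(values, psi, block):
    """Return (sum over blocks B of psi(M_B f)|B|, integral of psi:f) for samples
    of f on a uniform grid of [0, 1], each sample carrying measure 1/len(values)."""
    atom = 1.0 / len(values)
    lhs = 0.0
    for start in range(0, len(values), block):
        chunk = values[start:start + block]
        mean = sum(chunk) / len(chunk)          # mean value M_B f over the block
        lhs += psi(mean) * len(chunk) * atom    # psi[M_B f] * |B|
    rhs = sum(psi(v) for v in values) * atom    # integral of psi:f in the discrete model
    return lhs, rhs

N = 4096
values = [math.sin(6.0 * (i + 0.5) / N) for i in range(N)]  # hypothetical f
psi = lambda y: y * y                                       # a convex dominant
results = {block: jensen_sums(values, psi, block) for block in (1024, 64, 8, 1)}
```

Because the block sizes are nested, the sums in `results` grow as the blocks shrink and agree with the integral when each block is a single atom.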

It will be shown in the next three sections that the conditions specified 
in this theorem are also necessary for (a), but are not necessary for (1.9). 


COROLLARY. If $f\in L^*(E)$, $\phi\in C(R(f))$ and $0\le\phi(y)$ for $y\in R(f)$, the condition
$\phi$ convex on $R(f)$ implies

\[ \int_E\phi:f = S(f,\phi,E,k) \le \infty. \]

Proof. The function $\phi:f$ is measurable and non-negative on E. In case
$\int_E\phi:f=\infty$, Theorem (5.7) implies $S(f,\phi,E,k)=S(f,\phi,E)=\infty$. In case
$\int_E\phi:f<\infty$, the equality follows from Theorem (5.13).

6. Some lemmas. Throughout this section we assume that E is a measurable
linear set with
\[ |E| = \mu,\qquad 0<\mu<\infty; \]

$f\in L^*(E)$; $\phi$ is a non-negative function whose domain includes $R(f)$; and


sup = M < 
BEF 


We begin by developing a few properties of equimeasurable functions 
which are needed for our special purposes and some of which may not be 
immediately accessible in the literature. 


(6.1) DEFINITION. Of two functions u and v we say that u on A is equimeasurable
with v on B if and only if A and B are sets included, respectively, in the
domains of u and v and the sets

\[ A\,E_t[u(t)\le y],\qquad B\,E_s[v(s)\le y] \]

are measurable and of equal measure for $-\infty<y<\infty$.
We now set 
(6) See, for example, Hardy, Littlewood, and Pólya, Inequalities, Cambridge, 1934, pp.
150-152. The reader should bear in mind, here and later on, that the condition $\psi$ convex on an
interval implies $\psi$ continuous on the interior of that interval.




APPROXIMATING INTEGRALS BY SUMS 387 


$\alpha(y) = |E_t[f(t) \le y]|$ for $-\infty < y < \infty$


and on the open interval $(0, \mu)$ define the function $g$ by the relation
$g(s) = \inf E_y[s \le \alpha(y)], \quad 0 < s < \mu.$


(6.2) LEMMA. On $(0, \mu)$ the function $g$ is non-decreasing and equimeasurable
with $f$ on $E$.


Proof. Obviously $g$ is non-decreasing. Let $-\infty < y_0 < \infty$, let $a = \alpha(y_0)$, and
set


$b = |E_s[g(s) \le y_0]|.$


The following properties may then be readily deduced seriatim:

(a) $s \le \alpha(g(s))$ for $0 < s < \mu$;
(b) $g(s) \le y_0$ implies $s \le \alpha(g(s)) \le \alpha(y_0) = a$;
(c) $g(s) > y_0$ implies $s > a$.


Since $g$ is non-decreasing on $(0, \mu)$ we have


$\sup(\{0\} + E_s[g(s) \le y_0]) = b = \inf(\{\mu\} + E_s[g(s) > y_0]);$


this combines with (b) to yield $b \le a$ and with (c) to provide $b \ge a$, whence
$a = b$ and the lemma is proved.
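
For a step function the rearrangement $g$ of Lemma (6.2) can be computed directly: collect the total length on which each value is taken and list the values in increasing order. The Python sketch below is a minimal illustration under that assumption; the representation of $f$ as a list of `(length, value)` pairs and the names `rearrange`, `alpha`, and `integral` are hypothetical and are not taken from the paper.

```python
from fractions import Fraction

def rearrange(pieces):
    """Non-decreasing rearrangement of a step function.

    `pieces` lists (length, value) pairs describing f on disjoint
    parts of E; the result lists (length, value) pairs describing g
    on consecutive subintervals of (0, mu).
    """
    merged = {}
    for length, value in pieces:
        merged[value] = merged.get(value, Fraction(0)) + Fraction(length)
    return sorted(((l, v) for v, l in merged.items()), key=lambda p: p[1])

def alpha(pieces, y):
    """Measure of the set on which the step function is <= y."""
    return sum((Fraction(l) for l, v in pieces if v <= y), Fraction(0))

def integral(pieces):
    """Integral of the step function."""
    return sum((Fraction(l) * v for l, v in pieces), Fraction(0))

# Hypothetical f on a set E of measure mu = 2.
f = [(Fraction(1, 4), 3), (Fraction(1, 2), -1), (Fraction(1, 4), 3), (Fraction(1), 2)]
g = rearrange(f)
print(g)
```

Comparing `alpha(f, y)` with `alpha(g, y)` for several `y` checks equimeasurability in the sense of Definition (6.1), and `integral(f) == integral(g)` illustrates the equality of integrals stated in Lemma (6.4).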


(6.3) LEMMA. If $D$ is an open interval and $D'$ a measurable set with
$D \subset D' \subset (0, \mu)$, and if $B'$ is a subset of $E$ such that $f$ on $B'$ is equimeasurable
with $g$ on $D'$, then there exists a set $B \subset B'$ such that $f$ on $B$ is equimeasurable
with $g$ on $D$ and $f$ on $B' - B$ is equimeasurable with $g$ on $D' - D$.


Proof. Let (a, b) =D, and let 
a’= inf g(d), b’ = sup 
a<t<bdb a<t<d 
Bi, = BE a’], = DE [g(s) 
B,= BE < fi) < g(s) 


A simple check shows that 
Dz = D'E [a’ < g(s) < 


and that $f$ on $B_2$ is equimeasurable with $g$ on $D_2$. Clearly we have $|D_1| \le |B_1|$
and $|D_3| \le |B_3|$. By considering the sets
$(-x, x)B_1$ and $(-x, x)B_3$ for $0 < x$,
we find measurable sets $\beta_1$ and $\beta_3$ with
$\beta_1 \subset B_1, \quad \beta_3 \subset B_3, \quad |\beta_1| = |D_1|, \quad |\beta_3| = |D_3|.$


Thus $f$ on $\beta_1$ is equimeasurable with $g$ on $D_1$ and $f$ on $\beta_3$ is equimeasurable with
$g$ on $D_3$. Accordingly, on account of the disjointness of the sets involved, we
may take $B = \beta_1 + B_2 + \beta_3$, note that $D = D_1 + D_2 + D_3$, and conclude that $f$ on $B$
is equimeasurable with $g$ on $D$. It then follows at once that $f$ on $B' - B$ is
equimeasurable with $g$ on $D' - D$.


(6.4) LEMMA. If $u$ on $A$ is equimeasurable with $v$ on $B$, then


$\int_A u = \int_B v$ and $|A| = |B|$.


This well known result is easily seen. 


(6.5) LEMMA. If $G$ is a finite disjointed family of open subintervals of $(0, \mu)$,
we have

$\sum_{D \in G} \phi[M_D g]\,|D| \le M.$


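Proof. Let the number of intervals in $G$ be $n$ and let
$G = \{D_1\} + \{D_2\} + \cdots + \{D_n\}.$


By repeated use of Lemma (6.3) we infer the existence of a disjointed family
$F$ of $n$ subsets of $E$ such that


$F = \{B_1\} + \{B_2\} + \cdots + \{B_n\},$


with $f$ on $B_i$ equimeasurable with $g$ on $D_i$ ($i = 1, 2, \cdots, n$). From Lemma
(6.4) we have


$M_{B_i} f = M_{D_i} g, \quad |B_i| = |D_i| \quad (i = 1, 2, \cdots, n),$


whence


$\sum_{D \in G} \phi[M_D g]\,|D| = \sum_{i=1}^{n} \phi[M_{D_i} g]\,|D_i| = \sum_{i=1}^{n} \phi[M_{B_i} f]\,|B_i| = \sum_{B \in F} \phi[M_B f]\,|B| \le M.$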


(6.6) Lemma. We have R(f) = R(g). 


This result is almost immediate. 




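(6.7) LEMMA. If $(a, b)$ is an open subinterval of $(0, \mu)$ and if $\psi$ is a function
linear on an interval containing the image of $(a, b)$ under $g$ with


$\psi[M_{(a,b)} g] = \phi[M_{(a,b)} g],$

then we have


$\int_a^b \psi{:}g = \phi[M_{(a,b)} g](b - a).$


Proof. Let $J$ be an interval which contains the image of $(a, b)$ under $g$ and
on which $\psi$ is linear. We have


$\psi(y) = wy - w\bar{y} + \phi(\bar{y})$ for $y \in J$,


where $w$ is a suitable number and $\bar{y} = M_{(a,b)} g$, whence


$\int_a^b \psi{:}g = w \int_a^b g - w\bar{y}(b - a) + \phi(\bar{y})(b - a) = \phi(\bar{y})(b - a).$
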
(6.8) DEFINITION. A family $K$ is irreducible if and only if
$\sigma(K - \{C\}) \ne \sigma(K)$ for $C \in K$.


(6.9) DEFINITION. A finite irreducible family $K$ of open intervals is called a
chain if and only if $\sigma(K)$ is an open interval.


Concerning chains the following remarks may be in order. No interval in
a chain is the vacuous set, but the vacuous set itself constitutes a chain.
The intervals in a chain may be ordered according to non-decreasing left-hand
[right-hand] endpoints; these two orders are the same. Taking alternate
intervals in this ordering one readily sees that a chain is expressible as the
sum of two disjoint families each of which is itself disjointed. If $K$ is a chain,
$K_1 \subset K$, and $\sigma(K_1)$ is an interval, then $K_1$ is a chain. If the interval farthest
to the left [right] in a chain is deleted, the remaining family is a chain. If $K$
is a chain and $C$ is an open interval which overlaps $\sigma(K)$ but contains no
element of $K$, $K + \{C\}$ is a chain.
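
The remark about alternate intervals can be illustrated concretely. The Python sketch below is only an illustration: the functions `split_chain` and `is_disjointed` and the sample chain are hypothetical, with open intervals represented as `(left, right)` pairs. It orders a chain by left endpoints and takes alternate intervals, producing two families each of which is disjointed.

```python
def is_disjointed(intervals):
    """True if no two of the open intervals (a, b) overlap."""
    ordered = sorted(intervals)
    return all(prev[1] <= cur[0] for prev, cur in zip(ordered, ordered[1:]))

def split_chain(chain):
    """Order a chain by left endpoints and take alternate intervals."""
    ordered = sorted(chain)
    return ordered[0::2], ordered[1::2]

# Hypothetical irreducible chain whose union is the open interval (0, 1).
chain = [(0.0, 0.4), (0.3, 0.7), (0.6, 0.9), (0.8, 1.0)]
g1, g2 = split_chain(chain)
print(g1, g2)
```

The split works because, in an irreducible family, an interval cannot overlap the interval two places after it in the ordering; otherwise the interval between them would be superfluous.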

(6.10) Abbreviations. Let $\eta$ be a positive number fixed throughout the
rest of this section. We set


$W(D) = (\eta/2\mu)\,|D| + \phi[M_D g]\,|D|,$
$R^* = \{M_{(0,\mu)} g\} +$ the interior of $R(g)$.


For $n$ a positive integer we employ $(n)$ as an abbreviation for the following
statement:
If $H$ is a set with $n$ elements and if


$M_{(0,\mu)} g \in H \subset R^*,$


then there exists a non-negative convex function $\psi$ on $-\infty < y < \infty$ which
dominates $\phi$ on $H$, a chain $K$ with $\sigma(K) = (0, \mu)$, and for each $D \in K$ a
non-negative function $F_D$ on $(0, \mu)$, all of which are so related that we have


$\int_0^\mu F_D < W(D)$ for $D \in K$,


and $0 < s < \mu$ implies the existence of a $D \in K$ for which these conditions are
satisfied:
$s \in D, \quad \psi[g(s)] = F_D(s).$


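(6.11) LEMMA. (1) is valid.


Proof. Let $C = (0, \mu)$, $H = \{M_C g\}$, and $\psi$ be the non-negative convex
function on $(-\infty, \infty)$ defined by


$\psi(y) = \phi[M_C g]$ for $-\infty < y < \infty$.
Let $K = \{C\}$ and define $F_C$ on $(0, \mu)$ by

$F_C(s) = \phi[M_C g]$ for $0 < s < \mu$.
We then have


$\int_0^\mu F_C = \phi[M_C g]\,\mu = \phi[M_C g]\,|C| < W(C),$


and $0 < s < \mu$ implies
$F_C(s) = \phi[M_C g] = \psi[g(s)].$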
(6.12) LEMMA. If $\phi$ is continuous on $R(f)$ and $n$ is a positive integer, the
validity of $(n)$ implies that of $(n+1)$.
Proof. Let $H$ be a set with $n+1$ elements and assume
$M_{(0,\mu)} g \in H \subset R^*.$


Either there is a number in $H$ which is $> M_{(0,\mu)} g$ or there is a number in $H$
which is $< M_{(0,\mu)} g$. For definiteness we assume henceforth that the former is
the case and we designate by $\nu$ the largest such number. It should be clear
from the nature of the ensuing argument that analogous reasoning can be
brought to bear on the alternative case.


Let
$H' = H - \{\nu\},$
so that $H'$ is a set of $n$ elements and
$M_{(0,\mu)} g \in H' \subset R^*.$


In view of the validity of $(n)$, let $\psi'$ be a non-negative convex function on
$(-\infty, \infty)$ which dominates $\phi$ on $H'$; let $K'$ be a chain with $\sigma(K') = (0, \mu)$;
and for each $D \in K'$ let $F'_D$ be a non-negative function on $(0, \mu)$, all of which
are so related that we have


(a) $\int_0^\mu F'_D < W(D)$ for $D \in K'$,


and
(b) $0 < s < \mu$ implies the existence of a $D \in K'$ with
$s \in D, \quad \psi'[g(s)] = F'_D(s).$


In the case $\phi(\nu) \le \psi'(\nu)$ it is apparent that $(n+1)$ holds. From now on we
assume


(c) $\psi'(\nu) < \phi(\nu).$
Recalling that $M_{(0,\mu)} g$ is $< \nu$ we let $\xi$ satisfy the conditions
(d) $0 < \xi < \mu, \quad M_{(\xi,\mu)} g = \nu.$


Let C, be the interval farthest to the right in K. By moving only the right 
endpoint of C, a suitably small amount to the left we obtain an interval C: 
which not only enjoys the property that its right endpoint is interior to the 


interval 
(é, o(K {Ci}) 
but also, in view of the continuity of ¢, the property 


f < W(C2). 


Recalling (a) and defining 
(e) Fo, = Fo, Ki= K— {Ci} + {C3}, 


we see at once 


(f) f "Fo < W(D) forD € Ki. 


Our determination of C, insures that K; is a chain in which the interval 
farthest to the right is C,, whence 


(g) $€o(KiC; implies Ce 

Making use of (b), the first relation in (e), and (g) we infer 

(h) s € o(R:) implies the existence of a D € Ki with 
sED, ¥le(s)] = Fo(s). 




R: = Ki E [DC &.»)], 


K = {(é,»)}, 


we note that K; and K are chains and o(K) = (0, w). Since » is interior to R(g) 
we obtain from (d) the inequality g(t) <v. Clearly there exists a function y 
on (— ©, ©) with 


v(y) = ¥(y) for y < g(é), 
y linear on E [e(é) y], ¥(v) = 


Fp = Fp for D € 
for D=(£, uw) let Fp be defined on (0, uz) by the relations 
Fp(s) = ¥[g(s)] for§<s <u, 
Fp(s) = 0 for0<s 
By aid of (c) we find 
V(r) < = 


from this relation and (i) it follows that ¥ is non-negative and convex on 
(— ©) with 


¥(y) for y S ». 


Thus W dominates ¢ not only on H, since each number in H is <v, but on H 


as well. 
From (k), (d), (i), Lemma (6.7), and Abbreviations (6.10) we see that for 


D=(&, u) we have 
= f Fo = f vig = = 6[Mog]| D| < WD). 
0 
For D=(&, u) the validity of the relation 
] "F < W(D 
> < WD) 


is thus evident; that it is valid for DER: is clear from (f) and the relation 
K.CK;,. From the definition of K we then have relation (1) for DEK. 
For 0<sS£ we have s€o(K2) Co(R:i), and by (h) there exists DEK, with 


sED, ¥[g(s)] = Fo(s); 
for this particular D it is clear that we have 


DEE:CK, g(s) g(é), 


Setting 
Let 




and hence, by aid of (i) and (j), 
(m) sED, w[g(s)] = Fo(s). 


The proof that 0<s<y implies the existence of a DEK with property (m) 
is now readily completed by reference to (k). 
The proof of the lemma is now complete. 


(6.13) LEMMA. If $\phi$ is continuous on $R(f)$, $H$ is a finite subset of the interior
of $R(f)$, and $\eta$ is an arbitrary positive number, there exists a non-negative convex
function $\psi$ on $(-\infty, \infty)$ which dominates $\phi$ on $H$ and satisfies the condition


$\int_E \psi{:}f < \eta + 2M.$


Proof. Let $H^* = H + \{M_E f\}$. From Lemmas (6.4) and (6.6) we have
$M_{(0,\mu)} g = M_E f, \quad R(g) = R(f);$
hence, from the definition of $R^*$ in (6.10), we see that $H^*$ is a finite set with
$M_{(0,\mu)} g \in H^* \subset R^*.$


Lemmas (6.11) and (6.12) insure the existence of a non-negative convex
function $\psi$ which dominates $\phi$ on $H^*$, of a chain $K$ with $\sigma(K) = (0, \mu)$, and for
each $D \in K$ a non-negative function $F_D$ on $(0, \mu)$, all so related as to have the
properties specified in $(n)$ under (6.10):


(a) $\int_0^\mu F_D < W(D)$ for $D \in K$,


(b) $0 < s < \mu$ implies the existence of a $D \in K$ with
$s \in D, \quad \psi[g(s)] = F_D(s).$


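It was pointed out in the remarks following Definition (6.9) that we may
write


$K = G_1 + G_2,$


where $G_1$ is a disjointed family of intervals in $K$, $G_2$ is likewise, and $G_1 G_2$ is
vacuous. Thus, using (6.10) and Lemma (6.5), we have


$\sum_{D \in K} W(D) = \sum_{D \in G_1} W(D) + \sum_{D \in G_2} W(D)$
$= (\eta/2\mu) \sum_{D \in G_1} |D| + (\eta/2\mu) \sum_{D \in G_2} |D| + \sum_{D \in G_1} \phi[M_D g]\,|D| + \sum_{D \in G_2} \phi[M_D g]\,|D|$
$\le (\eta/2\mu)\mu + (\eta/2\mu)\mu + M + M = \eta + 2M.$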




The relation
$\psi[g(s)] \le \sum_{D \in K} F_D(s)$ for $0 < s < \mu$


is an immediate consequence of (b) and the non-negativity of $F_D$. These last
two inequalities now combine with (a) to yield


(c) $\int_0^\mu \psi{:}g \le \sum_{D \in K} \int_0^\mu F_D < \sum_{D \in K} W(D) \le \eta + 2M.$


Since $g$ on $(0, \mu)$ is equimeasurable with $f$ on $E$ and $E_y[\psi(y) \le a]$ is an interval
for $-\infty < a < \infty$, it is readily shown that $\psi{:}g$ on $(0, \mu)$ is equimeasurable
with $\psi{:}f$ on $E$. Accordingly, from Lemma (6.4) and (c) we have


$\int_E \psi{:}f = \int_0^\mu \psi{:}g < \eta + 2M.$


Since $\psi$ dominates $\phi$ on $H^*$, it dominates $\phi$ on $H$. The proof is complete.


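(6.14) LEMMA. If $J$ is an interval and for each positive integer $n$ the function
$\psi_n$ is non-negative and convex on $J$ with


$\limsup_{n \to \infty} \psi_n(t) < \infty$ for $t \in J$,


then there exists a function $\psi_0$ non-negative and convex on $J$ and integers $N_i$
($i = 1, 2, 3, \cdots$) with the properties


$1 \le N_1 < N_2 < N_3 < \cdots, \quad \lim_{i \to \infty} \psi_{N_i}(t) = \psi_0(t)$ for $t \in J$.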


Proof. Let $J_m = [a_m, b_m]$ ($m = 1, 2, 3, \cdots$) be a sequence of closed
intervals with


(a) $J_m \subset J_{m+1} \quad (m = 1, 2, 3, \cdots),$


and let
$C = \{\psi_n\}.$


For each pair of positive integers $m$ and $n$ the total variation of $\psi_n$ on $J_m$ is
less than or equal to $\psi_n(a_m) + \psi_n(b_m)$, a number which is also a bound
for $\psi_n$ on $J_m$. Thus there exists a sequence of non-negative numbers $M_m$
($m = 1, 2, 3, \cdots$) such that for each positive integer $m$ and each $\psi \in C$ the
total variation of $\psi$ on $J_m$ and the values of $\psi$ on $J_m$ do not exceed $M_m$.
From a theorem of Helly(7) we conclude that for each positive integer $m$ the
family $C$ is conditionally compact in the topology of pointwise convergence
on $J_m$; that is, each sequence of elements of $C$ contains a subsequence which
is pointwise convergent on $J_m$.

Using a familiar diagonal process of selection and the properties (a) of
the intervals $J_m$, we see that in the topology of pointwise convergence on $J$
the family $C$ is also conditionally compact. Hence there exist integers $N_i$
($i = 1, 2, 3, \cdots$) with $1 \le N_1 < N_2 < N_3 < \cdots$ such that


$\lim_{i \to \infty} \psi_{N_i}(t)$ exists (finite) for $t \in J$.


Defining $\psi_0$ on $J$ as this limit, it follows at once that $\psi_0$ is non-negative and
convex on $J$.


(7) Helly, Über lineare Funktionaloperationen, Sitzungsberichte der Akademie der
Wissenschaften in Wien vol. 121, IIa (1912) p. 283.

7. Concerning convex dominants for the function $\phi$. In this section $E$ is
to be understood as a measurable linear set with


$|E| = \mu, \quad 0 < \mu \le \infty.$
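(7.1) THEOREM. If $f \in L^*(E)$, $\phi$ is a non-negative continuous function on
$R(f)$, and


$\sup_{F \in \Gamma_\infty(E)} \sum_{B \in F} \phi[M_B f]\,|B| = M < \infty,$


then there exists a non-negative convex function $\psi$ which dominates $\phi$ on $R(f)$, with


$\int_E \psi{:}f \le 2M.$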


Proof. If the interior of $R(f)$ is vacuous, $R(f)$ consists of but a single
point; in this case $\psi$ may be taken as the non-negative convex function
$\psi(y) = \phi(y)$ for $y \in R(f)$ and the conclusion is apparent. Accordingly we
assume from now on that the interior of $R(f)$ is not vacuous.

Let $m > 0$ be such that the condition


$|[-m, m]E| > 0$
is satisfied. For each positive integer $n$ let
$E_n = [-mn, mn]E;$


let
$\{r_1\} + \{r_2\} + \{r_3\} + \cdots$


be the set of all rational numbers; let $f_n$ be the function defined on $E_n$ by
$f_n(t) = f(t)$ for $t \in E_n$;


let
$H_n = [\text{interior of } R(f_n)][\{r_1\} + \{r_2\} + \cdots + \{r_n\}];$


and let $h_n$ be the characteristic function of $E_n$. Finally, let


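(a) $H = \sum_{n=1}^{\infty} H_n.$

We see at once the relations
(b) $\lim_{n \to \infty} h_n(x) = 1$ for $x \in E$,


(c) $R(f_n) \subset R(f_{n+1}), \quad H_n \subset H_{n+1} \quad (n = 1, 2, 3, \cdots).$


A more careful check shows


(d) $R(f) = \sum_{n=1}^{\infty} R(f_n),$


(e) $\bar{H} \supset R(f),$


where $\bar{H}$ represents the closure of $H$.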
For each positive integer $n$ we have


$\sup_{F \in \Gamma_\infty(E_n)} \sum_{B \in F} \phi[M_B f_n]\,|B| \le \sup_{F \in \Gamma_\infty(E)} \sum_{B \in F} \phi[M_B f]\,|B| = M,$


and $H_n$ is a finite subset of $R(f_n)$; by Lemma (6.13) there exists a non-negative
convex function $\psi_n$ on $(-\infty, \infty)$ which dominates $\phi$ on $H_n$ and satisfies the
inequality


$\int_{E_n} \psi_n{:}f_n \le 2M + 1/n.$


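Let $y_0 \in R(f)$. Recalling (c) and (d) we let $n_0$ be such that $y_0 \in R(f_n)$ for
$n \ge n_0$. Let $B_0$ be a subset of $E_{n_0}$ with $M_{B_0} f_{n_0} = y_0$. From the definition of $f_n$
and the relations


$E_n \subset E_{n+1} \quad (n = 1, 2, 3, \cdots)$


follows
$y_0 = M_{B_0} f = M_{B_0} f_n$ for $n \ge n_0$,


and, by aid of Jensen's inequality,


$\psi_n(y_0) = \psi_n[M_{B_0} f_n] \le \frac{1}{|B_0|} \int_{B_0} \psi_n{:}f_n \le \frac{2M + 1/n}{|B_0|}$ for $n \ge n_0$,


whence
$\limsup_{n \to \infty} \psi_n(y_0) \le 2M/|B_0| < \infty.$


By Lemma (6.14) we now infer the existence of a non-negative convex
function $\psi$ on $R(f)$ and a sequence of integers $N_i$ ($i = 1, 2, 3, \cdots$) with the
properties

$1 \le N_1 < N_2 < N_3 < \cdots, \quad \lim_{i \to \infty} \psi_{N_i}(y) = \psi(y)$ for $y \in R(f)$.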




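Since $f(x) \in R(f)$ for almost all $x \in E$, it may be inferred by aid of (b) that
$\lim_{i \to \infty} h_{N_i}(x)\,\psi_{N_i}[f(x)] = \lim_{i \to \infty} \psi_{N_i}[f(x)] = \psi[f(x)]$


for almost all $x \in E$. Using a well known theorem of Fatou we then obtain


$\int_E \psi{:}f \le \liminf_{i \to \infty} \int_E h_{N_i} \cdot (\psi_{N_i}{:}f) = \liminf_{i \to \infty} \int_{E_{N_i}} \psi_{N_i}{:}f_{N_i}$


$\le \lim_{i \to \infty} (2M + 1/N_i) = 2M.$


From the second part of (c) and (a) it follows that $\psi$ dominates $\phi$ on $H$.
From (e), the fact that the convex function $\psi$ is upper semi-continuous on
$R(f)$, and the continuity of $\phi$, we now conclude that $\psi$ dominates $\phi$ on $R(f)$.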

REMARK. In Theorem (7.1) the hypothesis that ¢ is continuous can be 
dispensed with. To show this in detail appears to be a tedious process, but it 
may be indicated roughly as follows. Employing properties of equimeasurable 
functions as given in §6, Theorem (7.1) may be used to establish this result: 

If $f \in L^*(E)$, $\phi$ is a non-negative function on $R(f)$,

$\sup_{F \in \Gamma_\infty(E)} \sum_{B \in F} \phi[M_B f]\,|B| = M < \infty,$

$H$ is a finite subset of $R(f)$, and $\eta$ is an arbitrary positive number, there exists a
non-negative convex function $\psi$ on $(-\infty, \infty)$ which dominates $\phi$ on $H$ and
satisfies the condition


$\int_E \psi{:}f < \eta + 2M.$


Using this fact and suitably redefining H,, as it was introduced in the 
proof of Theorem (7.1), we may reach the desired conclusion. It is then easy 
to remove from Theorem (7.3) below the hypothesis that ¢ is continuous. 


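(7.2) THEOREM. If $f \in L^*(E)$, $\phi$ is a non-negative continuous function on
$R(f)$, and
$\sup_{F \in \Gamma_\infty(E)} \sum_{B \in F} \phi[M_B f]\,|B| = M < \infty,$
there exists a non-negative continuous convex function $\psi$ which dominates $\phi$ on
$R(f)$, with


$\int_E \psi{:}f \le 2M.$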


Proof. By Theorem (7.1) there exists a non-negative convex function $\psi_0$
which dominates $\phi$ on $R(f)$, with $\int_E \psi_0{:}f \le 2M$. If the interior of $R(f)$ is
vacuous, $\psi_0$ may be taken as $\psi$. Otherwise let $\psi$ be the continuous function on $R(f)$
which is identical with $\psi_0$ on the interior of $R(f)$. Since $\phi$ is continuous and $\psi_0$
is convex, we have


$\phi(y) \le \psi(y) \le \psi_0(y)$ for $y \in R(f)$
and the conclusion is immediate.


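(7.3) THEOREM. If $f \in L^*(E)$, $\phi \in C(R(f))$, and


$\sup_{F \in \Gamma_\infty(E)} \bigl| \sum_{B \in F} \phi[M_B f]\,|B| \bigr| < \infty,$


there exists a non-negative continuous convex function $\psi$ which dominates $|\phi|$
on $R(f)$, with


$\int_E \psi{:}f < \infty.$


Proof. Defining the function $\Phi$ on $R(f)$ by
$\Phi(y) = |\phi(y)|$


we infer from Theorem (3.12)


$\sup_{F \in \Gamma_\infty(E)} \sum_{B \in F} \Phi[M_B f]\,|B| < \infty.$


The conclusion then follows from Theorem (7.2).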

It is almost superfluous to remark that in Theorems (7.1)-(7.3), if the
set $E$ is essentially bounded, we have $L^*(E) = L(E)$ and $\Gamma_\infty(E)$ can be
replaced by $\Gamma_\eta(E)$ where $\eta$ represents the essential diameter of $E$.

8. Certain counterexamples. In order that the question at issue should be
completely covered, the theorems of §7 above must be supplemented by an
example to show that for $f \in L^*(E)$ and $\phi \in C(R(f))$ the condition that $|\phi|$ be
dominated by a convex function as specified in Theorem (5.13) is not
necessary for (1.9). In this section we shall give two examples and conclude with
a few supplementary remarks.

EXAMPLE A. The following example exhibits a function $f \in L(I)$ and a
non-negative function $\phi \in C(R(f))$ for which relation (1.9) holds but $\phi$ is
dominated on $R(f)$ by no convex function $\psi$ with $\int_I \psi{:}f < \infty$.

Let $\phi_1$ be defined on $(-\infty, \infty)$ thus: for $-\infty < y \le 0$, $\phi_1(y) = 0$; for $n \ge 1$ and
odd, $\phi_1(2n) = 2^{(2n)^2}$, $\phi_1(2n+2) = 2^{(2n+3)^2}$; for $n \ge 2$ and even, $\phi_1(2n) = 2^{(2n+1)^2}$,
$\phi_1(2n+2) = 2^{(2n+2)^2}$; on each closed interval $[2n, 2n+2]$ ($n = 0, 1, 2, \cdots$),
$\phi_1$ is linear. Let $\phi_2$ be defined in the same manner except for interchange of
the specifications when $n$ is odd and when $n$ is even. It may be verified at
once that the slope of $\phi_i$ ($i = 1, 2$) on the interval $[2n, 2n+2]$ is an increasing
function of $n$; thus $\phi_i$ is convex on $(-\infty, \infty)$. For each $y$ let
$\phi(y) = \min[\phi_1(y), \phi_2(y)]$, so that $\phi$ is continuous on $(-\infty, \infty)$.

Let $I_n$ ($n = 1, 2, 3, \cdots$) be a sequence of disjoint intervals with
$|I_n| = 1/2^{(2n)^2+n}$, $I_n \subset [2/3, 1]$ for $n$ odd, and $I_n \subset [0, 1/3]$ for $n$ even. Let the
functions $f_1$, $f_2$, and $f$ be defined on $I$ by


$f_1(x) = 2n$ for $x \in I_n$ and $n$ odd, $\quad f_1(x) = 0$ otherwise,


$f_2(x) = 2n$ for $x \in I_n$ and $n$ even, $\quad f_2(x) = 0$ otherwise,

$f(x) = f_1(x) + f_2(x)$ for $x \in I$.


One verifies at once $f_1 \in L(I)$, $f_2 \in L(I)$, and hence $f \in L(I)$.

Under the condition $\operatorname{diam}(B) \le \delta < 1/3$, which we now assume to be in
force, a set $B$ which contains a point of either interval $[0, 1/3]$, $[2/3, 1]$
contains no point of the other. From the definitions of $f_1$ and $f_2$ it is therefore
clear that

$\int_{A_B} f_1 > 0$ implies $\int_{A_B} f_2 = 0$


and vice versa. Thus, for $F \in \Gamma_\delta(I)$, we have


dX ¢[Masf]| Bl = o[Masfi + | B| 
BEF BEF 
= ¢[Masf]| Bl + o[Masf]| BI, 
BEF BEF 


as well as 


for-fontm = font f on for B EF. 


We may then infer from Theorem (5.13) 
lim sup | Msd:f o[Musf]|| 


&o BEF 


=lim sup > | Ms¢:f: — ¢[Masfs]|| 


0 BEF 


+lim sup | Msd:f2— = 0, 


FET;()D BEF 


since $\phi$ is dominated by the convex function $\phi_i$ with
$\int_I \phi_i{:}f_i < \sum_{n=1}^{\infty} 1/2^n = 1 \quad (i = 1, 2).$


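We shall now show that if $\psi$ is any convex function dominating $\phi$ on
$(-\infty, \infty)$, $\int_I \psi{:}f$ does not exist (finite). Let $n$ be odd and $\ge 1$ and consider
the interval $[2n, 2n+2]$. On this closed interval $\phi_1$ and $\phi_2$ are linear, with
derivatives that we may designate, respectively, by $\mu_1$ and $\mu_2$ ($\mu_1 > \mu_2$), and
their graphs intersect at a point $(\alpha, \beta)$, $2n < \alpha < 2n+2$. On the subintervals
$[2n, \alpha]$ and $[\alpha, 2n+2]$ we have respectively $\phi = \phi_1$ and $\phi = \phi_2$. The function $\zeta$
defined thus on $(-\infty, \infty)$,


$\zeta(y) = \begin{cases} \phi_1(y) & \text{for } 2n \le y \le \alpha, \\ \text{linear} & \text{for } -\infty < y \le \alpha, \\ \phi_2(y) & \text{for } \alpha \le y \le 2n+2, \\ \text{linear} & \text{for } \alpha \le y < \infty, \end{cases}$


is concave on $(-\infty, \infty)$, with $\zeta(y) \le \psi(y)$ for $-\infty < y < \infty$. Hence there exists
a function $\phi^*$ linear on $[2n, 2n+2]$, with derivative $\mu_0$ and with $\phi^*(\alpha) = \beta$,
satisfying the conditions


$\mu_1 \ge \mu_0 \ge \mu_2,$
$\phi^*(y) \le \psi(y)$ for $2n \le y \le 2n+2$.


Now a line through the point $(\alpha, \beta)$ with slope $\mu$ cuts the verticals through
$(2n, 0)$ and $(2n+2, 0)$, respectively, in points whose ordinates are


$h_1(\mu) = \beta + \mu(2n - \alpha), \quad h_2(\mu) = \beta + \mu(2n+2 - \alpha).$


From the condition $\psi(y) \ge \phi(y) \ge 0$ on $-\infty < y < \infty$, we infer


$\int_I \psi{:}f \ge \int_{I_n + I_{n+1}} \psi{:}f = \psi(2n)\,|I_n| + \psi(2n+2)\,|I_{n+1}|$


$\ge \phi^*(2n)\,|I_n| + \phi^*(2n+2)\,|I_{n+1}|$
$= h_1(\mu_0)\,|I_n| + h_2(\mu_0)\,|I_{n+1}| = H(\mu_0),$
where
$H(\mu) = h_1(\mu)\,|I_n| + h_2(\mu)\,|I_{n+1}|.$
Since $H$ is a linear function of $\mu$, at least one of the inequalities
$H(\mu_0) \ge H(\mu_1), \quad H(\mu_0) \ge H(\mu_2)$


holds. But, since no one of the numbers $h_i(\mu_j)$ ($i, j = 1, 2$) can be negative,
we have


$H(\mu_j) \ge 2^{(2n+1)^2}/2^{(2n)^2+n} = 2^{3n+1} \quad (j = 1, 2),$


and the fact that $\int_I \psi{:}f$ does not exist (finite) follows from the arbitrariness
of the odd integer $n$.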
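
The arithmetic behind Example A can be checked with exact integers. The Python sketch below is illustrative only: it assumes the readings $|I_n| = 1/2^{(2n)^2+n}$, $\phi_i(2n) = 2^{(2n)^2}$ on the intervals where $f_i = 2n$, and $\phi_i(2n) = 2^{(2n+1)^2}$ at the other even integers, which are the values consistent with the closing estimate $2^{(2n+1)^2}/2^{(2n)^2+n} = 2^{3n+1}$. The function names are hypothetical.

```python
from fractions import Fraction

def length(n):
    """|I_n| = 1 / 2**((2n)**2 + n), as assumed above."""
    return Fraction(1, 2 ** ((2 * n) ** 2 + n))

def phi_at(i, n):
    """Assumed value phi_i(2n) for i = 1, 2 and n >= 1."""
    low = (n % 2 == 1) if i == 1 else (n % 2 == 0)
    return 2 ** ((2 * n) ** 2) if low else 2 ** ((2 * n + 1) ** 2)

def partial_integral(i, terms):
    """Partial sum of the integral of phi_i : f_i over I."""
    parity = 1 if i == 1 else 0
    return sum(phi_at(i, n) * length(n)
               for n in range(1, terms + 1) if n % 2 == parity)

def lower_bound(n):
    """Smaller of h_1(mu_2)|I_n| and h_2(mu_1)|I_{n+1}| for odd n."""
    return min(phi_at(2, n) * length(n), phi_at(1, n + 1) * length(n + 1))

print(partial_integral(1, 7), partial_integral(2, 6), lower_bound(3))
```

The partial integrals stay below 1, while the lower bound for any convex dominant grows like $2^{3n+1}$, which is the contrast the example turns on.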


| 
whence 




Another question which it is natural to raise is answered by 

EXAMPLE B. The following example shows that in Theorem (5.13) the
hypothesis that $|\phi|$ be dominated by a convex function $\psi$ cannot be weakened
to the hypothesis that $|\phi|$ be dominated by a strictly monotone function $\psi$
even if one takes $E = I$ and insists that $f$ be continuous on $I$ save at one point,
that no sampling be indulged in, and that the restriction $F \in \Gamma_\delta(E)$ be
imposed.

Our example is one in which $f$ is non-negative on $I$ and $\phi$ itself is positive,
continuous, and strictly increasing on the interval $0 \le y < \infty$, with $\int_I \phi{:}f < \infty$.
In order that our purposes later in §11 may also be served we construct this
example to satisfy the additional condition


$\lim_{y \to \infty} \phi(y)/y = \infty.$


To fit completely into the situation of §11 we should extend the interval on 
which ¢ is defined to (— ~, ~) by setting 


o(y) = 16+ for— <y <Q, 


for example, so that ¢@ is continuous and positive on (—«, ©) with 


lim | = ©. 
Let 


a, = 1/2™, = = 1, 2,3,---), 
so that lim,_,.. 2, =0 and for each n we have 


On+1 = a., ms = Qn+1) My = a., Qn+1 < mM, < an ay, = 1/16. 
On I let g be defined by the conditions 
g(0) = 0, g(x) = 1 for 1/16 < x S 1, 
—5/4 
for < S Mp, 
g(x) = 


n = 1, 2,3,--- 
for my, < x S dn, ( 


1 
f(0) = 0, na) = f g forO < «51. 


The function $f$ is then absolutely continuous on every interval $[\epsilon, 1]$, $\epsilon > 0$,
but the relation


5/4 —5/4 
f g =m, On+1) = (m, Mn) 
Gn+1 


= ma “(1 — ma) > (15/16)ma 


implies limsof(x)= ©. 
The inequality 


and let 
(nm = 1, 2,3,---) 




may be shown as follows. For arbitrary m let 
@ = Gnit, Cc = Mn, b = a,. 
Then we have 


b 
f(a) = f(b) +f 
= f(b) + (6-0), 


whence 


f f = — + (6 — = (6 — — — «)?/2; 
and 


I(x) = = f(c) + — x) foraS 


fs = (c — a)f(c) + — a)*/2. 


Thus we obtain 


b 
f f= = + - 6 91/2, 


where 
c8/4(¢ — a)? — (b — c)? = — — (6 — 0)? > — — BP 
= — c)? — ¢ > c*/4(15/16)? — ¢ 
= c*/4((15/16)? — ci/4] 
> c*/4[(15/16)? — (1/16)/4] > 0. 


From the definition of g we have 


0 < g(x) S for0 < x31, 
whence 


1 1 
(b) 0 f(x) -f g sf < for0< #31. 


Thus we infer $f \in L(I)$. Moreover, as $x$ decreases from 1 toward 0, $f$ is strictly
increasing as well as continuous, and $f$ increases from 0 toward $\infty$; that is, to
each $y$, $0 \le y < \infty$, corresponds one and only one $x$ with $0 < x \le 1$, $y = f(x)$. For
$0 < y < \infty$ we may therefore define $\phi(y)$ by the conditions




= 16, ¢[f(x)] = 162-1? forO0< #31. 


Clearly, $\phi$ is continuous and strictly increasing on the infinite interval
$0 \le y < \infty$, with $\int_I \phi{:}f < \infty$. On the other hand, from (a) we have for each $n$


(8.1) = 16m, — G41) = 160, (an — 


= 16(1 — a.) > 16(1 — a,) = 16(15/16) = 15, 


which implies 


lim sup f =: 


Ber 
and from (b) 
o[f(x)] = 164-2? = (42-14)? > [f(x) ]? for0< #351, 
which implies 
$(y)/y = 


Finally, we may point out that no real generality is added to Theorem
(5.13), when $\operatorname{diam} E < \infty$ and $\phi \in C((-\infty, \infty))$, by assuming only deferred
application of the dominance of $|\phi|$ by $\psi$ or of the convexity of the dominant
function $\psi$. For, if the conditions


$EC((— ~, ~)), | o(y)| S for0 
— ©, o), 
y convex on ( ) f vif 


are satisfied, let a=maxjy)su|¢(y)|. Then y(y)=¥(y)+a is convex on 
if the conditions 


¢$EC((— | o(y)| S¥(y) on(— ©, ©), WEC((— ~)), 


convex for — ~ <yS MandforMSy<o, fur<e 
E 


are fulfilled, let Then =max [y(y), a] is convex, 
and |¢| is <y1, on (— ©, If $(y) —¥(y) on (— ©, &), it is easily 
seen that [© UC((—~, ~)), whence fEL;(Z) by Theorem (5.2); thus 
— Set sf exists. 

9. The essential nature of the hypothesis of continuity on $\phi$. The extent
to which this hypothesis is essential to the result embodied in Theorem (5.12)
is rather clearly indicated by the following theorem.




(9.1) THEOREM. If ¢ is a function on (— ©, ©) and fEBL,(I) implies 


then $\phi(y)$ is continuous on $(-\infty, \infty)$.


To prove this theorem we shall show that the assumption of the hypothesis
and the contrary of the conclusion leads to a contradiction. For simplicity
we first consider the case in which $\phi(y)$ has an hypothetical discontinuity at
$y = 0$, with


(a) $\phi(-1) = \phi(0) = \phi(1) = 0, \quad \limsup_{y \to 0+} \phi(y) > a > 0.$
We make these assumptions, and with the object of constructing a function 


fEBL,(J) which will yield a contradiction we proceed presently to establish 
a lemma. For this purpose the following notation will be convenient: 


{i,m} = E[G—1)/2°S 25 j/2"] (w= 1,2,3,---; 7 =1,2,---, 2%), 


sin 
P(f) = E [| f(z) |_> 0]. 


We also define certain sequences of functions as follows. For 21, 
0<£<1, and ACT let h,,; be defined on J by 


1 for j/2" — &/2* x j/2" G = 1,2,---, 2"), 
= for «45 3/4+ £/4, 
0 otherwise; 
let H,,: be defined on (— ©, ©) to be periodic of period 1/2" with 
Hn, = for x 7; 
let 5,,4 be defined on J by 
0 for {j,n} if Afj,n} (f =1,2,---, 2%), 


1 otherwise; 


= { 


and let g,,¢,4 be defined on J by 
Bn, t,A(%) = bn,4(%) An, 2(x) for I, 
so that P(g,,:,4) is a closed set. These functions being defined, we assert 


LemMaA 1. Jf A is a closed subset of I and O<&<1 we have 
(i) gn,t,a(x) =Oforx€ {j,n} if A {j,n} 40 (m=1, 2,3,-- 7=1,2,---,2"); 




(ii) for integers m1, nz with nz=m2=1, 
f Eng, = 0 
(iii) (the hypothesis of Theorem 8 being assumed) 
lim San(gn,¢,4) = (1 — | A | )o(€)/2. 
no 


Conclusion (i) is immediate from the definition of 5,,4. To prove the other 
conclusions we regard A and £ as fixed, drop these letters as subscripts, set 


— 1)/2* + j/2"]/2) = (n = 1, 2, 3,- =1,2,---, 2"), 


and proceed as follows. From the definition of h, we have 


fr =o 
I 


whence, for such and j=1, 2, +--+, 2%, 


jn 


f H,(x/2")dx = 
I 


hp(x)dx = 0. 
2" n I 


2 


For m2=m, 21 we then have 


f = f Sn, = 0 


and (ii) is proved. 
As for (iii) we observe first 


lim = lim (1/2") 2" J in| 


no no 


= lim (1/2) {x + > ol in| 


jul 


1 
= (1/2")()2"* + = 
1/2 
by applying the hypothesis of Theorem (9.1) to the function on J which is 
zero for x€[0, 1/2] and equal to hk, for x€[1/2, 1]. Next we note, for 
y=1,2,+++, 2%, 



(nm = 1, 2, 3,---); 




jm (y—1)2"41 j= (v—1)2"41 


j= (v—1)2"41 


2" 
= Bn,» > 


= pn,r2"Sa( hn). 


(j,n} im 


Hence for integral m 21, we have 


= (1/2) 2 J 


foe (v—1)2"41 


From the fact that J—A is an open set, the density of the set of points 
j/2” (n=1, 2,3, +--+; 7=0,1, +--+, 2") in J, and the definition of 5, one may 
infer lima. frin(x)dx=1— |.A | . Using also (b) one obtains conclusion (iii). 

Returning now to the proof of Theorem (9.1), let $\eta$ be an arbitrary number
satisfying the inequality $0 < \eta < 1$, and for each integer $n \ge 1$ let $\xi_n$ be a
number satisfying the conditions $0 < \xi_n < \eta/2^n$, $\phi(\xi_n) > a$.

We are ready to begin an inductive definition of a sequence of functions f, 
(n=1, 2,3, +++). Let Ao be the vacuous set; and let Ni 21 be so large that, 
upon defining fi =gw,,t,,49 We have in accordance with Lemma 1 (iii), 


Sevilfi) 2 o(&:)/2 — 1/2. 


For the second stage of the procedure let A1=P(fi); and let N2>2N;, be so 
large that, upon defining f2=gw,,¢,,4:, we have in accordance with Lemma 1 


(iii) and the hypothesis of Theorem (9.1), 
= (1 — | — 1/22, 


| | 1/2*. 
We observe | A;| = &:. 
For the third stage, let A2= P(fi)+P(f2) =Ai1+P(f2); and let V3>2N:2 be 
so large that, upon defining fs = gw,,¢,,4,, we have in accordance with Lemma 1 
(iii) and the hypothesis of Theorem (9.1), 
2 (1 — | Aa] )o(Gs)/2 — 1/28, 
| + fo) | S 




The inequality | Aa| <£&:+2should be noted. In general, f; (7 =1,2, - - -,2—1) 
having been defined, let A,1=A,2+P(f,-1); and let N,>2N,_; be so large 
that upon defining f, =fwv,,¢,,4,.-1 We have 

2 (1 — | An| — 1/2", 

+ fa + | S 1/2". 


We note | An-1| + 
Finally we define 


(c) 


F(x) = for «ETI, Pee, 
ful jul 


observing that P(f)=A and that A;CA js: (j=1, 2, 3, - - - ) implies 
|4| = lim | 4;| <>. 
j=l 
For »22 we may write 
n—1 
Lh 

and obtain 


(d) Sev, (F) = > fr) + Sav, (fn)- 


This arises from the fact that for each (j=1, 2,-++, 22%*) we have by 


Lemma 1 (ii) 


and by Lemma 1 (i) 


E 


which together imply 


From (c) and (d) we have 
Sav,(F) = (1 — | An-a| — 1/24 = (1 — | Ana| Jae/2 — 1/2", 




whence 
lim Sey,(F) 2 (1 —| A| )a/2 > (1 — n)a/2. 


But from the hypothesis of Theorem (9.1) we obtain 


lim S,(F) = f ov = 0, 
I 
an obvious contradiction. This completes the proof of Theorem (9.1) in the 
case characterized by conditions (a). 

We shall now show how a proof of Theorem (9.1) in the general case can
be obtained from the particular case already considered. Let $B$ stand for the
class of functions $f$ each of which is measurable on $I$ and assumes no more
than three values. A function $\phi$ defined on $(-\infty, \infty)$ will be said to possess
property (B) if and only if


$f \in B$ implies $\lim_{n \to \infty} (1/2^n) \sum_{j=1}^{2^n} \phi[M_{\{j,n\}} f] = \int_I \phi{:}f.$


The result obtained above may now be formulated thus:

(i) If $\phi$ has property (B), with $\phi(-1) = \phi(0) = \phi(1) = 0$, then $\limsup_{y \to 0+} \phi(y)$
is $\le 0$.
We assert

(ii) A necessary and sufficient condition that $\phi(y)$ be continuous for all $y$ is
that it possess property (B).

The necessity of the condition may be inferred at once from Theorem
(5.2). To prove the sufficiency we first verify without difficulty that if $\phi_1$
and $\phi_2$ have property (B), so also do $\phi_1 + \phi_2$ and $c\phi_1(ay + b)$, where $a$, $b$, and $c$
are arbitrary real numbers. Now let $\phi$ be assumed to have property (B) and
to be discontinuous at an arbitrary point $y_1$; then $\phi_1(y) = \phi(y + y_1)$ has
property (B) and is discontinuous at $y = 0$. Also, for $-\infty < y < \infty$, let


$\phi_2(y) = \phi_1(-1)\,y(1 - y)/2 + \phi_1(0)(y + 1)(y - 1) - \phi_1(1)\,y(1 + y)/2,$


so that $\phi_2$, being continuous, has property (B). The function $\phi_3 = \phi_1 + \phi_2$ then
has property (B), with $\phi_3(-1) = \phi_3(0) = \phi_3(1) = 0$, and $\phi_3$ is discontinuous at
$y = 0$. Of the four functions $\pm\phi_3(\pm y)$, at least one is a function $\phi_4$ possessing
property (B), with $\phi_4(-1) = \phi_4(0) = \phi_4(1) = 0$ and with $\limsup_{y \to 0+} \phi_4(y) > 0$,
in contradiction to (i).

This completes the proof of Theorem (9.1). 
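
The normalizing step in this reduction is a quadratic interpolation: $\phi_2$ is chosen so that $\phi_2(t) = -\phi_1(t)$ at $t = -1, 0, 1$. The Python sketch below checks this with exact rational arithmetic. It is an illustration only; the sample function `phi1`, with a jump at $y = 0$, and the helper `make_phi2` are hypothetical, and the signs in `make_phi2` are those forced by the requirement $\phi_3(-1) = \phi_3(0) = \phi_3(1) = 0$.

```python
from fractions import Fraction

def make_phi2(phi1):
    """Quadratic phi_2 with phi_2(t) = -phi_1(t) for t in {-1, 0, 1}."""
    a, b, c = phi1(-1), phi1(0), phi1(1)
    def phi2(y):
        y = Fraction(y)
        return a * y * (1 - y) / 2 + b * (y + 1) * (y - 1) - c * y * (1 + y) / 2
    return phi2

def phi1(y):
    # Hypothetical test function, discontinuous at y = 0.
    return Fraction(5) if y > 0 else Fraction(y) ** 2 - 3

phi2 = make_phi2(phi1)
phi3 = lambda y: phi1(y) + phi2(y)

print([phi3(t) for t in (-1, 0, 1)], phi3(Fraction(1, 1000)))
```

Since $\phi_2$ is continuous, $\phi_3$ keeps the jump of $\phi_1$ at the origin while vanishing at the three normalizing points, which is what the appeal to (i) requires.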

REMARK. In connection with Theorem (9.1) it is worth noting that the 
conditions ¢€C((— “, ~)), fEBL,(J) imply the existence of a function y 
convex on (— ©, ©) and depending upon f, with | (¥)| <y(y) on (— 
and fEL,(J). It would be of some interest to prove or disprove either of the 
following conjectures. (1) The hypothesis of Theorem (9.1) implies that to 




each f€L,(J) corresponds a function convex on (— ©, ©), with |o(y)| <v(y) 
on (— ©, ©) and fEL,(J). (2) The hypothesis of Theorem (9.1) implies the 
existence of a function y convex on (— ©, ©), with |o(y)| <y(y) on (— @, ~) 
and L4(J) CL,y(J) (that is, L4(1) =L,(J)). Attempts which we have made in 
this direction have met with no success. 

10. Concerning the replacement of hypothesis (1.6) by (2.10). In this
section we propose to determine as definitely as possible under what
circumstances the sample sets need only be of positive measure; that is, the
“k-hypothesis” (1.6) can be relaxed to (2.10). A basic, though simple, result of this
kind is embodied in Theorem (5.1). In RS it has been seen that, in case $E = I$
and $\phi(y) = y$ for all $y$, the question turns entirely on whether $f \in R(I)$, the class
of functions Riemann integrable on $I$. Thus it is natural to restrict the
considerations of this section to the case in which $E$ is a bounded interval, say
$E = I$.

We begin by establishing 


(10.1) THEOREM. If $f \in L(I)$, $\phi{:}f \in L(I) - R(I)$, and (1.6) is
replaced by (2.10),
$\lim_{\delta \to 0} \sum_{B \in F \in \Gamma_\delta(I)} \phi[M_{A_B} f]\,|B|$
does not exist. Thus, when $\phi{:}f \in L(I) - R(I)$, the k-hypothesis is indispensable
to Theorems (5.2)-(5.13).


Proof. However small $\delta > 0$ may be, let $F \in \Gamma_\delta(I)$. If $\phi{:}f$ is not essentially
bounded on $I$, there is at least one interval $B \in F$ on which $\phi{:}f$ is not
essentially bounded; that is, for arbitrary $M > 0$ there exists a measurable set $A \subset B$
with

$|A| > 0, \quad |\phi[f(x)]| > M$ for $x \in A$.


The set 


C = BE [f(x) ER(/) for f on B] 


is of measure zero. Hence A(B—C) is of positive measure, and there exist 
points x; with 


for f on B; 
that is, there exists a measurable set A1CB with 
|¢[Mas]| > 


Thus the sum in question is unbounded. 
Secondly, if $\varphi:f$ is essentially bounded on $I$ and $B \in F$, let

$$\operatorname{ess\,inf}_{x \in B} \varphi[f(x)] = l, \qquad \operatorname{ess\,sup}_{x \in B} \varphi[f(x)] = L.$$

The same reasoning now suffices to show that the sum in question can be
made arbitrarily close to either the essential lower or essential upper Darboux
sum for $\varphi:f$ corresponding to the family of intervals $F$. Hence the $\lim_{\delta \to 0}$ of
the sum does not exist.
We next proceed in the direction of obtaining a favorable result when
$f \in R(I)$. Without the k-hypothesis, even the conditions $f \in C(I)$ and
$\varphi \in C(R(f))$ are shown to be insufficient for the existence of $\int_I \varphi:f$, and hence
insufficient for (1.9), by the trivial example

$$f(x) = x \ \text{ for } x \in I, \qquad \varphi(y) = 1/y \ \text{ for } y \in (0, 1) = R(f).$$
The nature of Theorems (5.8)—(5.12) suggests that we consider assuming 
(10.2) fERD), lim sup |¢[Muasf]|| Bl < @. 
3D BEF 


Fer, 
We assert, however, 
(10.3) THEOREM. The conditions (10.2) are equivalent to the conditions 


fERD), ¢€ BCR). 


Proof. It is evident that this pair of conditions implies the set (10.2). On
the other hand, if $\varphi \notin B(R(f))$, let $a \notin R(f)$ be an endpoint of $R(f)$ in every
neighborhood of which $\varphi$ is unbounded; and let $F$ be an arbitrary family
$\in \Gamma_\delta(I)$ for any $\delta > 0$. Then at least one interval $B \in F$ has the property
that the ess inf, or ess sup, of $f$ on $B$ is $a$. Since $a \notin R(f)$, it follows at once
that as $A_B$ ranges over the subsets of $B$ having measure $> 0$, $M_{A_B} f$ assumes
all values in a unilateral neighborhood of $a$. Hence $\varphi[M_{A_B} f]\,|B|$ is an
unbounded function of $A_B$, and the sum in (10.2) is unbounded for each $\delta > 0$.

As a preliminary step in the desired direction we next establish 


(10.4) THEOREM. The conditions $f \in R(I)$ and $\varphi \in C(\overline{R}(f))$ are sufficient for
(1.9), even with (1.6) replaced by (2.10).


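Proof. We may assume $f$ itself to be an element of $R(I)$, with $f(x) \in \overline{R}(f)$
for $x \in I$. $\overline{R}(f)$ is a bounded interval, whence $\varphi \in UC(\overline{R}(f))$. Thus to $\varepsilon > 0$
corresponds $\eta = \eta(\varepsilon) > 0$ such that the conditions

$$y_1, y_2 \in \overline{R}(f), \quad |y_1 - y_2| < \eta \quad \text{imply} \quad |\varphi(y_1) - \varphi(y_2)| < \varepsilon/6;$$

and we then have

$$|\varphi(y_1) - \varphi(y_2)| < \varepsilon/6 + \varepsilon\,|y_1 - y_2|/(6\eta) \quad \text{for } y_1, y_2 \in \overline{R}(f).$$

Let $g$ and $h$ be functions continuous on $I$ and satisfying the conditions

$$g(x) \le f(x) \le h(x) \ \text{ for } x \in I, \quad \min_{x \in I} g(x) \in \overline{R}(f), \quad \max_{x \in I} g(x) \in \overline{R}(f),$$

$$\min_{x \in I} h(x) \in \overline{R}(f), \quad \max_{x \in I} h(x) \in \overline{R}(f), \quad \int_I (h - g) < \eta.$$

Let $\delta_1 = \delta_1(h - g, \varphi_1, \eta) > 0$ correspond to $\eta$ for the functions $h - g$ and $\varphi_1(y) = y$
($-\infty < y < \infty$) in accordance with Theorem (5.1); let $\delta_2 = \delta_2(h, \varphi, \varepsilon/6) > 0$
correspond to $\varepsilon/6$ for the functions $h$ and $\varphi$ in accordance with Theorem (5.1);
and let $\delta = \min[\delta_1, \delta_2]$. For $F \in \Gamma_\delta(I)$ we then have

$$\sum_{B \in F} |M_B \varphi:f - \varphi[M_{A_B} f]|\,|B| \le \int_I |\varphi:f - \varphi:h| + \sum_{B \in F} |M_B \varphi:h - \varphi[M_{A_B} h]|\,|B| + \sum_{B \in F} |\varphi[M_{A_B} h] - \varphi[M_{A_B} f]|\,|B|.$$

On the right the first term is

$$\int_I |\varphi:f - \varphi:h| < \varepsilon/6 + \varepsilon \int_I |f - h| /(6\eta) < \varepsilon/6 + \varepsilon/6 = \varepsilon/3.$$

By the choice of $\delta$ ($\le \delta_2$) the second term on the right is $< \varepsilon/6$. The third
term is not greater than

$$\sum_{B \in F} \{\varepsilon/6 + \varepsilon\,|M_{A_B} h - M_{A_B} f|/(6\eta)\}\,|B| \le \varepsilon/6 + \frac{\varepsilon}{6\eta} \sum_{B \in F} |M_{A_B}(h - g) - M_B(h - g)|\,|B| + \frac{\varepsilon}{6\eta} \int_I (h - g),$$

which by the choice of $\delta$ ($\le \delta_1$) is less than

$$\varepsilon/6 + [\varepsilon/(6\eta)]\,\eta + \varepsilon/6 = \varepsilon/2.$$

The proof is now complete.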
For use in demonstrating the next theorem the following lemma will be 


convenient. 
(10.5) Lemma. Let fER(1), BC(R(f)), <M < @, 
a = inf f(x), B= sup f(x), B-—a>Q0O, 
El zEl 


[sER(/)], 
0<7<(6—a)/2, and 


[2 


To corresponds n=n(€)>0 and 5=5(n)>0 such that if FET;(I) and F, 




represents the subfamily of F in which each set B has the property | BE,| >0, 
we have Bl <e/M, whence 
| Med:f|| Bl <e and | || Bl <«. 

BOF 


BEFi 
Proof. We have0 =| E.[f(x) =a]| =lim,.o| E,| ; let satisfy the conditions 
0<n<(B—a)/2, <e/M. 


We may assume f itself to be an element of R(J). Any point xCE,—E,, 
where E, represents the closure of E,, is then a point of discontinuity of f; 
and since the discontinuities of f are a set of measure zero, we have | Z,| 
=|E,| <e/M. Let 

I- E, = > 


where >.;{0,} is a countable set of disjoint open intervals. Let k be a finite 
number of these intervals with the property 


> | 0;| >1—é«/M. 
t=1 


From each end of each O; (¢=1, 2, +--+, %) delete a semi-open interval to 
leave an open interval O/, in such manner that we have 


k 
0f|>1—«/M; 


and let the measure of the smallest deleted interval be 6 (6>0). ForFET;(J) 
and F,= F—F, we then have 
k 


> |B) = | B| <«/M, 


BEF; i=1 BEFi 


| Mse:f|| Bl < Me/M =«, | ¢[Mtasf] || B| < Me/M =e. 


(10.6) THEOREM. The conditions $f \in R(I)$ and $\varphi \in BC(R(f))$ are sufficient for
(1.9), even with (1.6) replaced by (2.10).


Proof. Let f itself be an element of R(J). If R(f)=R(f), the conclusion 
follows from Theorem (10.4). If a=infzer f(x), B=supzer f(x), B—a>0, 
atR(f) [BER(f)], and 0<<(6—a)/2, let 


fle) for « € E [f(x) <a +n] 
= for € E [f(2) > B — 


and for other values of x let f,(x) =f(x). One easily sees 


hERD, Rf) =RF)CRY, # SCRA). 




Let 7 and 6,>0 correspond to €/9 in accordance with Lemma (10.5); let 
5. = 52(f,, 6, €/9) >0 correspond to €/9 for the functions f, and ¢ in accordance 
with Theorem (10.4); and let 5=min [6;, For FET's(Z) we then have 


Mad:f — ¢[Manf]|| 
| Mao: f — Msd:f,|| + | Mb: f, — || 
BEF BEF 


+ | — || 
< |Mse:f||Bl/ + | Mag: /,|| Bl + «/9 
BEF: BEFi 
+ | + | Bl 


where F; has the significance attached to it in Lemma (10.5). 
The question of whether the k-hypothesis can be relaxed under the con- 


ditions 
fELD-RM, ¢€ 
is still open. That it sometimes can is illustrated by the trivial example

$$f \in L(I), \qquad \varphi(y) = \text{const. for } -\infty < y < \infty.$$

That it sometimes cannot, even if $\varphi$ is a very simple convex function and $f$ is
bounded, may be shown as follows. Let $\varphi(y) = |y|$ for $-1 \le y \le 1$; let $E \subset I$
be a non-dense perfect set, with $|E| = 1/2$, and let $f(x) = 1$ for $x \in E$, $f(x) = -1$
for $x \in I - E$. For each $n$ ($n = 1, 2, 3, \dots$) let $F_n$ be the family of $2^n$
equal subintervals of $I$. In each one of these intervals there exists a subset
of $I - E$ of measure $> 0$. In each one of at least half of these intervals there
exists a subset of $E$ of measure $> 0$, and hence a subset $A_B$ with $|A_B| > 0$,
such that $M_{A_B} f = \int_{A_B} f = 0$. In any one of the remaining subintervals the
maximum absolute value of the integral mean of $f$ on any subset is 1. Hence we
have

$$\liminf_{n \to \infty} \sum_{B \in F_n} \varphi[M_{A_B} f]\,|B| \le 1/2 < 1 = \int_I \varphi:f.$$
This example being in hand, some interest may attach to 


(10.7) THEOREM. The conditions fEL (I), ¢:fER(1), convex on and 
(1.6) replaced by (2.10), imply 


lim sup ¢[Masf]|B| fos < 
Fer,(D I 

Proof. We may assume ¢:/f itself to be an element of R(J). Let € be an 
arbitrary positive number and let g be a function continuous on J and satisfy- 
ing the conditions 




In accordance with Theorem (5.1) let 6=65(g, ¢1, €/2) >0 correspond to ¢€/2 
for the functions g and ¢:(y) =y (— © <y< @). Using Jensen’s inequality we 
then have, for FET;(J), 


BEF BEF BEF 


<fete2<foste 
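The role of Jensen's inequality in this proof can be checked numerically. The following is a minimal sketch, not part of the original argument: it takes every sample set equal to the whole interval B (a special case of (2.10)), replaces integral means by averages over midpoint samples, and compares the sum of φ(M_B f)|B| with the corresponding approximation of the integral of φ:f. The function name `mean_sums` and the test function are illustrative assumptions.

```python
def mean_sums(f, phi, n_cells, samples_per_cell=200):
    """Return (sum of phi(M_B f)|B|, approximation of the integral of phi(f))
    over n_cells equal subintervals B of [0, 1], using midpoint samples."""
    width = 1.0 / n_cells
    sum_phi_of_mean = 0.0
    integral_phi_f = 0.0
    for k in range(n_cells):
        xs = [(k + (j + 0.5) / samples_per_cell) * width
              for j in range(samples_per_cell)]
        values = [f(x) for x in xs]
        # Sample average standing in for the integral mean M_B f.
        mean_f = sum(values) / samples_per_cell
        # Sample average standing in for the integral mean of phi(f) on B.
        mean_phi_f = sum(phi(v) for v in values) / samples_per_cell
        sum_phi_of_mean += phi(mean_f) * width
        integral_phi_f += mean_phi_f * width
    return sum_phi_of_mean, integral_phi_f


# Example with the convex function phi(y) = |y| and a sign-changing f.
lhs, rhs = mean_sums(lambda x: x * x - 0.3, abs, n_cells=4)
print(lhs, rhs)
```

For a convex φ the first returned value never exceeds the second, because both are built from the same sample points and the finite form of Jensen's inequality applies cell by cell.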


In the example last cited above $f$ is in the class $L(I) - R(I)$ because of
the fact that, though bounded, it has discontinuities at a set of positive
measure. This gives rise to the question: do there exist examples in which it is
impossible to relax the k-hypothesis when $f \in L(I) - R(I)$ because $f$ is
unbounded, even though $f$ is improperly Riemann integrable on $I$, $\varphi:f \in R(I)$,
and (1.9) holds with the k-hypothesis in force? Should no such example exist,
it would probably be possible to enlarge the class of cases already determined
by Theorem (10.6) in which the k-hypothesis can be relaxed by obtaining an
affirmative answer to some such question as the following. Do the conditions


fER(, ¢€ 


imply that the k-hypothesis can be relaxed? 
The answer to this question, however, is negative, as one may see from 


the ensuing example. Let 
0 for y= 2(2* — 1) (n = 0,1,2,---); 
= for y = — 1) — (m = 1, 2, 3,---); 
linear between consecutive values of y for which ¢ is already defined ; 


2(2"— 1) for 1/[4(2"+! — 1)?] < x s 1/[4(2* — 1)?] 


0 for 1, x= 0; 
(m = 1, 2,3,---). 


Then we have $f \in R^*(I)$, since it is a non-negative function dominated on $I$
by $f_1$ where $f_1(x) = 1/x^{1/2}$ for $0 < x \le 1$, $f_1(0) = 0$; $\varphi:f \in R(I)$, because it vanishes
identically on $I$; $R(f) = \overline{R}(f) = E_y[0 \le y < \infty]$; and $\varphi \in UC(R(f))$, for its graph
consists of the upper sides of an infinite sequence of similar triangles all with
bases on the $y$-axis. If now $F \in \Gamma_\delta(I)$ is the family of $2^n$ equal subintervals of $I$,
the terms of the sum

$$\sum_{B \in F} \varphi[M_{A_B} f]\,|B|$$

are all non-negative and the value of the term containing the interval
$B = [0, 1/2^n]$ can be made arbitrarily large by a suitable choice of $A_B$. Thus
the $\lim_{\delta \to 0}$ of the $\sup_{F \in \Gamma_\delta(I)}$ of such sums is infinite.

In the foregoing example $\varphi$ is unbounded. It is natural therefore to ask
if the situation is the same when $\varphi$ is bounded; in answer we have theorems
as follows.


(10.8) THEOREM. The conditions $f \in R^*(I)$ and $\varphi \in BUC(R(f))$ are sufficient
for (1.9), even with (1.6) replaced by (2.10).


Proof. To $\varepsilon > 0$ corresponds $\eta = \eta(\varepsilon) > 0$ such that the conditions

$$y_1, y_2 \in R(f), \quad |y_1 - y_2| < \eta \quad \text{imply} \quad |\varphi(y_1) - \varphi(y_2)| < \varepsilon/16,$$

whence

$$|\varphi(y_1) - \varphi(y_2)| < \varepsilon/16 + \varepsilon\,|y_1 - y_2|/(16\eta) \quad \text{for } y_1, y_2 \in R(f).$$

Let $M$ satisfy the condition $|\varphi(y)| \le M < \infty$ for $y \in R(f)$.

Let $D \subset I$ be the set of points defined by the condition $x \in D$ if and only if
$\operatorname{ess\,lim\,sup}_{x' \to x} |f(x')| = \infty$. Clearly $D$ is closed, with $|D| = 0$. Let the open set
$I - D$ be constituted of the disjoint open intervals $O_i$ ($i = 1, 2, 3, \dots$),
and let $m$ be such that $\sum_{i=1}^m |O_i| > 1 - \varepsilon/(8M)$. From each end of $O_i$
($i = 1, 2, 3, \dots, m$) delete an open interval of length $\varepsilon/(16Mm)$ and designate
the remaining closed interval by $F_i$, so that $\sum_{i=1}^m |F_i| > 1 - \varepsilon/(4M)$. On
each interval $F_i$ the function $f$ is essentially bounded. Let $f_N$ be the truncated
function $f$ specified in the definition of the condition $f \in R^*(I)$ as given in §2,
$N$ being chosen to satisfy the conditions


f 
I 


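Finally, let $\delta_1 = \delta_1(f_N, \varphi, \varepsilon/8) > 0$ correspond to $\varepsilon/8$ for the functions $f_N$ and $\varphi$
in accordance with Theorem (10.6); let $\delta_2 = \varepsilon/(16Mm)$; and let $\delta = \min[\delta_1, \delta_2]$.
For convenience let $F_i'$ represent the closed interval obtained from $F_i$
($i = 1, 2, \dots, m$) by deleting from each end of $F_i$ a semi-open interval of
length $\varepsilon/(16Mm)$, so that we have $\sum_{i=1}^m |F_i'| > 1 - 3\varepsilon/(8M)$ and
$|I - \sum_{i=1}^m F_i'| = 1 - \sum_{i=1}^m |F_i'| < 3\varepsilon/(8M)$. For $\mathcal{F} \in \Gamma_\delta(I)$ let $\mathcal{F}_1$ represent the
subfamily of $\mathcal{F}$ defined by the condition $B \in \mathcal{F}_1$ if and only if $B \cdot \sum_{i=1}^m F_i' \ne 0$; and
let $\mathcal{F}_2 = \mathcal{F} - \mathcal{F}_1$. For $\mathcal{F} \in \Gamma_\delta(I)$ we then have

$$\sum_{B \in \mathcal{F}} |M_B \varphi:f - \varphi[M_{A_B} f]|\,|B| \le \int_I |\varphi:f - \varphi:f_N| + \sum_{B \in \mathcal{F}} |M_B \varphi:f_N - \varphi[M_{A_B} f_N]|\,|B| + \sum_{B \in \mathcal{F}} |\varphi[M_{A_B} f_N] - \varphi[M_{A_B} f]|\,|B|.$$

On the right the first term is less than

$$\varepsilon/16 + \varepsilon \int_I |f - f_N| /(16\eta) < \varepsilon/16 + \varepsilon/16 = \varepsilon/8.$$

By our choice of $\delta$ ($\le \delta_1$) the second term on the right is $< \varepsilon/8$. The third
term may be written

$$\sum_{B \in \mathcal{F}_1} + \sum_{B \in \mathcal{F}_2}.$$

Since $\delta$ is $\le \delta_2$, each set $B \in \mathcal{F}_1$ contains only points of $\sum_{i=1}^m F_i$; whence, by
choice of $N$ and $f_N$,

$$\sum_{B \in \mathcal{F}_1} |\varphi[M_{A_B} f_N] - \varphi[M_{A_B} f]|\,|B| = 0.$$

As for the remaining sum, from the relation $\sum_{B \in \mathcal{F}_2} B \subset I - \sum_{i=1}^m F_i'$ we infer

$$\sum_{B \in \mathcal{F}_2} |\varphi[M_{A_B} f_N] - \varphi[M_{A_B} f]|\,|B| \le 2M \cdot 3\varepsilon/(8M) = 3\varepsilon/4,$$

which completes the proof.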


(10.9) THEOREM. The conditions $f \in R^*(I)$ and $\varphi \in BC(R(f))$ are sufficient
for (1.9), even with (1.6) replaced by (2.10).


Proof. Let $M$ satisfy the condition $|\varphi(y)| \le M < \infty$ for $y \in R(f)$; and let $\varphi_i$
($i = 1, 2, 3, \dots$) be a normal approximating sequence for $\varphi$ on $R(f)$, as
defined in (5.5). Let sets $O_i$, $F_i$, and $F_i'$ be defined as in the proof of Theorem
(10.8). Let the positive integer $N$ satisfy the conditions


2M | Ev| where Ey = E[| f(x)| 
ess sup | f(x)| < V 
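Let $\delta_1 = \delta_1(f, \varphi_N, \varepsilon/8) > 0$ correspond to $\varepsilon/8$ for the functions $f$ and
$\varphi_N \in BUC(R(f))$ in accordance with Theorem (10.8); let $\delta_2 = \varepsilon/(16Mm)$; and
let $\delta = \min[\delta_1, \delta_2]$. For $\mathcal{F} \in \Gamma_\delta(I)$ and $\mathcal{F}_1$, $\mathcal{F}_2$ defined as in the proof of Theorem
(10.8) we then have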


BEF 
+2 | dv [Mass] ¢[Masf]|| B| 


Be": BEF, 


By the choice of $N$, each term in $\sum_{B \in \mathcal{F}_1}$ vanishes and the sum $\sum_{B \in \mathcal{F}_2}$ is not
greater than $2M \cdot 3\varepsilon/(8M) = 3\varepsilon/4$.

11. Generalizations of boundedness, continuity, bounded variation, and
absolute continuity of a function. The principal object of this section is to
sketch in a background behind certain well known properties of the $L_p$ ($p \ge 1$)
classes of functions, which will throw these properties into rather clear relief.
For our present purposes $E$ will be taken as the unit interval $I$, $f$ as a function
whose domain includes $I$, and $\varphi$ as a non-negative function whose domain
includes the set of numbers


(11.1) $[f(b) - f(a)]/(b - a)$, where $0 \le a < b \le 1$.


The results to be obtained will lose only a little of their significance if $\varphi$ is
regarded as a fixed function whose domain includes the interval $(-\infty, \infty)$,
while the function $f$ is left free to vary; and this case is probably the one of
greatest interest.


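(11.2) DEFINITION. The function $f$ shall be said to be $\varphi$-bounded on $I$ if
and only if there exists a number $M$ with

$$\varphi\!\left(\frac{f(b) - f(a)}{b - a}\right)(b - a) \le M \quad \text{for } 0 \le a < b \le 1.$$

(11.3) DEFINITION. For $a \in I$ the function $f$ shall be said to be $\varphi$-continuous
at $a$ if and only if the condition

(a) $\displaystyle \lim_{x \in I,\ x \to a} \varphi\!\left(\frac{f(x) - f(a)}{x - a}\right)|x - a| = 0$

is satisfied; $f$ shall be said to be $\varphi$-continuous on $I$ if and only if $f$ is $\varphi$-continuous
at each $a \in I$, uniformly $\varphi$-continuous on $I$ if and only if condition (a) holds
uniformly with respect to $a$ for $a \in I$.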


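(11.4) DEFINITION. The $\varphi$-total variation of $f$ on $I$ is

$$T^{\varphi}_I(f) = \limsup_{\delta \to 0} \sum_{j=1}^{N} \varphi\!\left(\frac{f(x_j) - f(x_{j-1})}{x_j - x_{j-1}}\right)(x_j - x_{j-1}),$$

where $0 = x_0 < x_1 < \cdots < x_N = 1$, $\delta = \max_j (x_j - x_{j-1})$.


If and only if $T^{\varphi}_I(f)$ is $< \infty$, $f$ shall be said to be of $\varphi$-bounded variation on $I$.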
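The sums entering Definition (11.4) are easy to evaluate for a given subdivision. The sketch below is only an illustration under assumptions of our own: the helper names `phi_variation_sum` and `uniform_partition` are hypothetical, and only uniform subdivisions are tried, so the code evaluates individual sums and does not compute the lim sup itself.

```python
def phi_variation_sum(f, phi, points):
    """Sum of phi(difference quotient) * (interval length) over the
    consecutive points of a subdivision 0 = x_0 < x_1 < ... < x_N = 1."""
    total = 0.0
    for left, right in zip(points, points[1:]):
        slope = (f(right) - f(left)) / (right - left)
        total += phi(slope) * (right - left)
    return total


def uniform_partition(n):
    """Subdivision of [0, 1] into n equal parts."""
    return [j / n for j in range(n + 1)]


# With phi(y) = |y| the sum is the ordinary variation sum of f.
print(phi_variation_sum(lambda x: x * x, abs, uniform_partition(8)))

# With phi(y) = y**2 and f(x) = x**2 the sums approach the integral of (2x)**2.
print(phi_variation_sum(lambda x: x * x, lambda y: y * y, uniform_partition(1000)))
```

For the second call the sum equals 4/3 - 1/(3 n^2) with n = 1000, which is close to the integral of φ:f' over I.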


(11.5) DEFINITION. The function $f$ shall be said to be $\varphi$-absolutely continuous
on $I$ if and only if to each $\varepsilon > 0$ corresponds a $\delta = \delta(\varepsilon) > 0$ such that, $[a_j, b_j]$
($a_j < b_j$; $j = 1, 2, \dots, N$) being any finite set of closed subintervals of $I$ disjoint
except perhaps for common endpoints, the inequality $\sum_j (b_j - a_j) < \delta$ implies

$$\sum_{j=1}^{N} \varphi\!\left(\frac{f(b_j) - f(a_j)}{b_j - a_j}\right)(b_j - a_j) < \varepsilon.$$

Several consequences of these definitions are plain. (I) The defining
properties here employed reduce to those of ordinary boundedness, continuity,
and so on when $\varphi(y) = |y|$ for $-\infty < y < \infty$. (II) If $f$ is $\varphi$-absolutely
continuous on $I$, it is uniformly $\varphi$-continuous on $I$ and of $\varphi$-bounded variation on $I$.
(III) If $\varphi$ is bounded on the set of numbers (11.1), $f$ is $\varphi$-absolutely continuous
on $I$; in particular this hypothesis will be satisfied if $f$ satisfies a Lipschitz
condition (of order 1) on $I$ with Lipschitz modulus $M$ and $\varphi$ is bounded on
the closed interval $[-M, M]$.

The observation that @ continuous on (— ©, ©) and f continuous and 
o-continuous on I do not imply f uniformly $-continuous on I or f ¢-bounded 
on I is justified by the following example. Let 0 </< @ and g(x) =/x for xEl. 
Let [a,, bn] (an<b,;=1, 2, 3, - - - ) be an infinite sequence of disjoint closed 
subintervals of J with the properties },=1, 


<e 


lim 5, = 0; < Gn, m, +13 marr (mn = 1, 2,3,---), 


where 
tin = [Ibn — (— lan) ]/(bn — Gn) = + Gn) /(bn — Gn). 
Let 


0 for x=0, 
f(x) = lb, for x= d,, 


—la, for x = an, (wn = 1, 2,3,---), 


and on each closed interval [an, [bn41, @n] (n=1, 2, 3, - ) let f be linear. 
Let 6€C((— ©, «)) be arbitrary save for the satisfaction of the conditions 


o(m,) = 1/(bn — an) (n = 1, 2,3,---). 


Then f is easily seen to be continuous and ¢-continuous on J but not uni- 
formly ¢-continuous on J. If ~, (n=1, 2, 3, - - - ) is any infinite sequence of 
numbers with 


Mn < Pn < (n = 1, 2,3,---), 


it is clear that @ can be so defined on the points p, (with limy,.. 6(Pn) = ©) 
that f will not be ¢-bounded on J. 

In the subsequent discussion we shall have occasion at times to subject
$\varphi$ to one or more of the following conditions: continuity (usually on
$-\infty < y < \infty$), convexity (usually on $-\infty < y < \infty$),


(11.6) $\displaystyle \liminf_{|y| \to \infty} \varphi(y)/|y| > 0$,


(11.7) $\displaystyle \lim_{|y| \to \infty} \varphi(y)/|y| = \infty$,


the last two being suggested, respectively, by the particular cases $\varphi(y) = |y|$,
$\varphi(y) = |y|^p$ ($p > 1$) on $-\infty < y < \infty$.

The function $\varphi(y) = |y|^p$ ($p > 1$) is convex and satisfies (11.7). For this
function $\varphi$ it is well known(*) that $\varphi$-bounded variation of $f$ implies $\varphi$-absolute
continuity of $f$, so that the two conditions are equivalent; also that a necessary
and sufficient condition for $f$ to be the indefinite integral of a function $g$
with $g \in L_p = L_\varphi$ is that $f$ be $\varphi$-absolutely continuous. It seems to us of some
interest to determine (1) whether these properties are enjoyed by a more
general class of functions $L_\varphi$, and (2) whether convexity of $\varphi$ or the
satisfaction of (11.7) by $\varphi$ is the more important contributing factor in bringing
about these properties.

One may easily prove 


(11.8) LEMMA. The condition (11.6) is equivalent to the condition that there
exist positive numbers $a$, $b$ with $a + b\varphi(y) \ge |y|$ for $-\infty < y < \infty$.
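The text leaves the verification of Lemma (11.8) to the reader. One possible argument, offered here only as a sketch and relying on the standing assumption of this section that $\varphi$ is non-negative, runs as follows; the constants $c$ and $Q$ are auxiliary symbols introduced for the illustration.

```latex
% Sketch of a proof of Lemma (11.8); c and Q are auxiliary constants.
\textbf{(11.6) implies the inequality.}
If $\liminf_{|y|\to\infty} \varphi(y)/|y| > 0$, choose $c > 0$ and $Q > 0$ with
\[
  \varphi(y) \ge c\,|y| \quad\text{for } |y| \ge Q .
\]
Put $a = Q$ and $b = 1/c$. Then
\[
  a + b\,\varphi(y) \ge b\,\varphi(y) \ge |y| \quad (|y| \ge Q),
  \qquad
  a + b\,\varphi(y) \ge a = Q > |y| \quad (|y| < Q),
\]
the second line using $\varphi \ge 0$.

\textbf{The inequality implies (11.6).}
If $a + b\,\varphi(y) \ge |y|$ for all $y$ with $a, b > 0$, then for $y \ne 0$
\[
  \frac{\varphi(y)}{|y|} \ge \frac{1}{b}\Bigl(1 - \frac{a}{|y|}\Bigr),
  \qquad\text{so}\qquad
  \liminf_{|y|\to\infty} \frac{\varphi(y)}{|y|} \ge \frac{1}{b} > 0 .
\]
```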

Using this lemma one obtains at once 

(11.9) THEOREM. If $\varphi$ satisfies (11.6) and $f$ is $\varphi$-bounded [or $\varphi$-continuous,
or of $\varphi$-bounded variation, or $\varphi$-absolutely continuous] on $I$, $f$ is bounded [or
continuous, or of bounded variation, or absolutely continuous, respectively] on $I$.


More generally, in fact, if $\varphi_1$, $\varphi_2$ are such that there exist positive numbers
$a$, $b$ with $a + b\varphi_1(y) \ge \varphi_2(y)$ for $-\infty < y < \infty$, $\varphi_1$-boundedness or $\varphi_1$-continuity
and so on of $f$ on $I$ implies the corresponding $\varphi_2$-property of $f$ on $I$.


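(11.10) THEOREM. If $\varphi$ satisfies (11.7) and $f$ is of $\varphi$-bounded variation on $I$,
$f$ is absolutely continuous on $I$.


Proof. Let $T^{\varphi}_I(f) = B$; let $\delta_1 > 0$ be such that

$$\sum_{j=1}^{N} \varphi\!\left(\frac{f(x_j) - f(x_{j-1})}{x_j - x_{j-1}}\right)(x_j - x_{j-1}) < 2B \quad \text{for } \delta < \delta_1;$$

let $M = M(\varepsilon) > 0$ be such that $3B/M < \varepsilon$; and let $Q = Q(M) > 0$ be such that

$$\varphi(y)/|y| > M \quad \text{for } |y| \ge Q.$$

Then we have

$$MQ + \varphi(y) > M|y| \quad \text{for } -\infty < y < \infty.$$

If the norm $\delta$ of the intervals $[a_j, b_j]$ is $< \delta_1$, we have

$$2B > \sum_j \varphi\!\left(\frac{f(b_j) - f(a_j)}{b_j - a_j}\right)(b_j - a_j) > \sum_j \left\{ M\,\frac{|f(b_j) - f(a_j)|}{b_j - a_j} - MQ \right\}(b_j - a_j) = M \sum_j |f(b_j) - f(a_j)| - MQ \sum_j (b_j - a_j);$$

and if $\sum_j (b_j - a_j)$ is $< B/(MQ)$, we obtain

$$M \sum_j |f(b_j) - f(a_j)| < 2B + MQB/(MQ) = 3B, \qquad \sum_j |f(b_j) - f(a_j)| < 3B/M < \varepsilon.$$


(*) See, for example, Titchmarsh, The theory of functions, Oxford, 1932, pp. 384-386.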


REMARK. If $\varphi$ is convex on $(-\infty, \infty)$ and satisfies (11.7), and $f$ is
absolutely continuous and $\varphi$-continuous on $I$, it does not follow that $f$ is uniformly
$\varphi$-continuous or of $\varphi$-bounded variation on $I$. This is shown by the example


$\varphi(y) = y^2$ on $(-\infty, \infty)$,
1/2" for x= 1/2", 
0 for 1/2" + (n = 3, 4,5, +++); 
f(x)=4 0 for x= 0,1; 
linear on each closed interval between consecutive points for 
which f is already defined. 
(11.11) THEOREM. If $\varphi$ is continuous on the closure of the set of numbers
(11.1) and $f'$ exists (finite) almost everywhere in $I$, we have $\int_I \varphi:f' \le T^{\varphi}_I(f) \le \infty$.

Proof. Let $S_n$ ($n = 1, 2, 3, \dots$) represent a sequence of points of subdivision
of $I$,

$$S_n:\quad 0 = x_{n,0} < x_{n,1} < \cdots < x_{n,N_n} = 1,$$

with

$$\lim_{n \to \infty} \delta_n = 0, \qquad \delta_n = \max_j\,(x_{n,j} - x_{n,j-1}),$$

and

$$T^{\varphi}_I(f) = \lim_{n \to \infty} \sum_{j=1}^{N_n} \varphi\!\left(\frac{f(x_{n,j}) - f(x_{n,j-1})}{x_{n,j} - x_{n,j-1}}\right)(x_{n,j} - x_{n,j-1}).$$

For each $n$ let $p_n$ stand for the polygonal function inscribed in $f$ with
$p_n(x) = f(x)$ for $x \in S_n$, so that we have

$$T^{\varphi}_I(f) = \lim_{n \to \infty} \int_I \varphi:p_n'.$$

Since $f'$ exists almost everywhere, $p_n'$ tends to $f'$ almost everywhere(*) in $I$.
The function $\varphi$ being continuous and non-negative on the closure of the set of
numbers (11.1), a well known theorem of Fatou yields

$$\int_I \varphi:f' \le \lim_{n \to \infty} \int_I \varphi:p_n' \le \infty,$$

the desired conclusion.

REMARK. The hypothesis of continuity on $\varphi$ cannot be deleted from this
theorem. For the example $\varphi$ non-measurable on $I$, $f(x) = x^2/2$ for $x \in I$, shows
that $\int_I \varphi:f'$ may not then exist. And the example $\varphi(y) = 1$ for $-\infty < y \le 0$,
$\varphi(y) = 0$ for $0 < y < \infty$, $f$ a non-decreasing singular function constant on no
subinterval of $I$, yields $\int_I \varphi:f' = 1 > 0 = T^{\varphi}_I(f)$.

(11.12) THEorem. If f is absolutely continuous on I and $C C(R(f’)), the 
condition S*(f’, ¢, I)< © implies that f is p-absolutely continuous on I and 
Sro:f' 


Proof. The hypotheses imply the existence of 65 satisfying the conditions 


0<5<1, sup >, $[Mzf’]| B| < 
FEr;() BEF 


For 6;—a;<6 (j=1, 2, 3, - - - ) we have 


j— 


< sup ¢[Mz/’]| Bl, 
FEr;(4) Ber 


where 


H= ae [a;, 


By Theorem (4.4), this sup tends to zero with | H| ; hence the first conclusion 
follows. The hypotheses also imply, by Theorem (5.11), 


jal 


— Xj-1 
From Theorem (5.13) we obtain the 


COROLLARY. If $f$ is absolutely continuous on $I$ and $\varphi \in C(R(f'))$ is dominated
by a convex function $\psi$ with $\int_I \psi:f' < \infty$, the same conclusions may be
drawn.


(*) See, for example, Titchmarsh, loc. cit. pp. 385-386.


Another immediate consequence of Theorem (5.11) is 

(11.13) THEOREM. Under the hypotheses of Theorem (11.12) or its Corollary, 
if k is =1 and the intervals [a;, b;] satisfy the conditions 

a; [xe1, 2], O<a;— =1,2,---,M), 


we have 


(11.14) Ti(f) = lim > (xj — 


jul b; — a; 


The next theorem is an application of Theorem (10.9). 


(11.15) THrorem. Jf f is absolutely continuous on I with f'ER*(I), 
BC(R(f’)), and the intervals [a,;, b;] satisfy the conditions 


a;,b;E O<b;- a; (f =1,2,---,M), 
we have (11.14). 
The next three theorems are immediate consequences of Theorems (11.9),
(11.10), and (11.11).

(11.16) THEOREM. If $\varphi$ is continuous on $(-\infty, \infty)$ and satisfies (11.6),
a sufficient condition for $f' \in L_\varphi(I)$ is that $f$ be of $\varphi$-bounded variation on $I$.


Proof. By Theorem (11.9), $f$ is of bounded variation on $I$ and $f'$ consequently
exists (finite) almost everywhere in $I$; since $f$ is of $\varphi$-bounded variation,
Theorem (11.11) yields $\int_I \varphi:f' \le T^{\varphi}_I(f) < \infty$.

Similarly we obtain 

(11.17) THEOREM. If $\varphi$ is continuous on $(-\infty, \infty)$ and satisfies (11.6), a
sufficient condition for


(11.18) $\displaystyle f' \in L_\varphi(I), \qquad f(x) = f(0) + \int_0^x f' \ \text{ for } x \in I$


is that $f$ be $\varphi$-absolutely continuous on $I$.


(11.19) THEOREM. If $\varphi$ is continuous on $(-\infty, \infty)$ and satisfies (11.7),
a sufficient condition for (11.18) is that $f$ be of $\varphi$-bounded variation on $I$.


Proof. By Theorem (11.10), $f$ is absolutely continuous on $I$; and from Theorem
(11.11) we have $\int_I \varphi:f' \le T^{\varphi}_I(f) < \infty$.

That no one of the sufficient conditions given in Theorems (11.16), (11.17),
and (11.19) is necessary is shown by Example B of §8. For, let the function
called $f$ in that example be designated now by $f'$ and let $f$ be defined on $I$ by

$$f(x) = \int_0^x f' \quad \text{for } x \in I.$$


Defining ¢(y) =16+? for — © <y<0, we obtain a non-negative ¢ which is 
continuous everywhere and satisfies (11.7). And although f is absolutely con- 
tinuous on J, with f’EL,(J), it is not of ¢-bounded variation on J in virtue 
of inequality (8.1). 

The following example achieves the same result in a different way, ex- 
hibiting a non-negative ¢ which is continuous everywhere and satisfies (11.7), 
and an f which is absolutely continuous on J, with f’EL,(J), but has an 
“infinite ¢-discontinuity” at x=0. 

Let g(x) =x? for so that g’EL,(J). Let O0Sa<6 31; and let h(x) 
for x [a, b] be linear with h(a) =g(a), h(b) =g(b). Simple computations then 
yield 


ab) 
E + an + + (643 — < — atl?) 


(11.20) 


< (8/3)(08 — = 2 f 


In view of the fact that g is concave on J with g’ increasing continuously and 
monotonically toward © as x decreases from 1 toward 0, there exists a se- 
quence of numbers a, (n=1, 2, 3, - - - ) decreasing monotonically toward 0 
and satisfying the following conditions: 


1, 


0 < < 


(a1) — g(a2) 
= < =m 
a, — Ge a 


1, 


11.21 


Gn an 
(m = 2, 3,4,--+). 


0< Gnt1 < Gn, Mn~1 < L, 


Each of the sequences {/,}, {m,} then increases monotonically toward © 
with » and between each pair of consecutive numbers from either sequence 
occurs one and only one number from the other sequence. Hence there exists 
a sequence of closed intervals {J,} with the following property: for each 
n, I, has m, as its midpoint and J, contains neither /, nor /,4:. That is, the 
set )*_,J, contains no point of the sequence {/,}. 

Let f be defined on J as follows: f(0) =0; for each n, f(x) is linear on the 
interval [aa41, dn} with f(@n41) =g(@n41), f(@n) =g(an). From inequality (11.20) 
we infer f’EL,(J). Since f is monotone and continuous on J, and on each in- 
terval [e, 1] (0<e<1) it is polygonal and so absolutely continuous, it is ab- 
solutely continuous on J. 

For y not in let =y?; for y=m, (n=1, 2,3, - -), let o(y) =y*; 
in each closed half (that is, left and right half) of each interval J,, let ¢ be 




linear. Then @¢ is continuous for — © <y< © and satisfies (11.7). We also 
have, by aid of (11.20), 


an an an 

nl n=l Gn +1 n=l I 

On the other hand if x tends to 0 over the sequence {a,}, we have 


n) — f(0 
(dn 0) = an = 1/a,° 
a, — 0 
From the Corollary to Theorem (5.13) we obtain at once 


(11.22) THEOREM. Jf f is absolutely continuous on I and @ is convex and 
continuous on K(f’), we have fr ~. 

Using Theorem (11.12), we obtain the 

Coro.iary. If f is absolutely continuous on I, o is convex and continuous 


on R(f'), and f is of o-bounded variation on I, then f is o-absolutely continuous 
on I. 


Theorem (11.22) implies that if f is absolutely continuous on J and ¢ is 
convex and continuous on R(f’), we have 


(11.23) Tr(f) = lim The(f) © (0<e< 1). 


That the assumption of convexity on ¢ is essential for (11.23) may be seen 
from the following example, which exhibits an f which is absolutely continu- 
ous on J and a ¢ which is continuous everywhere and satisfies (11.7), but f has 
a “finite ¢-discontinuity” at x =0, with lim sups.0 o[(f(x) —f(0))/x]x=1; and 
for each positive 


f is ¢-absolutely continuous on [e, 1], 


Tienlf) < 4-242,  Tr(f) = Thoa(f) = @. 


Let g(x) =x"/? for xEJ, so that g’ELs(J). Let O0Sa<b<1; and let h(x) 
for x€ [a, 6] be linear with h(a) = g(a), h(b) =g(b). Simple computations pro- 
vide the inequality 


= 4 gi/2)1/2 


= 20 


— gilt) < 2(p1/4 — 
(11.24) 




As in the example last described, g is concave on I with g’ increasing continu- 
ously and monotonically toward © as x decreases from 1 toward 0, and there 
exists a sequence of numbers { an} satisfying conditions (11.21). This time 
let I, (n=1, 2, 3, - - + ) be a sequence of closed intervals with the following 
property: for each n, J, has J, as its midpoint and J, contains neither m,_1 
My. 

Let f be defined on J as in the preceding example, so that it is absolutely 
continuous there. From (11.24) it is clear that f’EJZs(/). For y not in 
Un, let =y?; for y=], (n =1, 2, 3, - - ), let =y*/?; in each closed 
half (that is, left and right half) of each interval J,, let ¢ be linear. Thus @ 
is continuous for — © <y< © and satisfies (11.7). 

On the interval [e, 1], f satisfies a Lipschitz condition; let M=M(e) 
satisfy the inequality 


| — f(*2) | S M| x1 — for x1, %2 [e, 1]. 


On the interval [0, M], ¢ is bounded; hence ¢ is dominated on the interval 
«) by a convex function y, with From the corollary to 
Theorem (11.12) we therefore infer the ¢-absolute continuity of f on [e, 1] 
and, in view of (11.24) also, the relation 


For 0<x 1 we also have f(x)/x21 and 
olf(x)/x]x [f(x)/x]*x [g(x)/x]*x = 1, 


whence 
lim sup ¢[f(x)/x]x < 1; 


and if x tends to 0 over the sequence {a,}, we have 
=1 (mn = 1,2,3,---), 
which implies 
lim sup ¢[f(x)/x]x = 1. 
z—0 
To show the property 7§,.;= © (0<e<1), we choose a particular se- 


quence {an}; namely a,=1/22*-) (n=1, 2, 3, ---). This clearly satisfies 
the principal inequality of (11.21), 


< [g(dn) — — < 


which reduces to 


4-2U2, 




< + ans) < 
since we now have 
+ = + 1/2” = (1/2” )(1/2 + 1/4) < 1/2" = 
From the obvious inequality 
[e(an) — g(@n+2)]/(@n — Gny2) < g(Gn)/Gn = Mn, 


we infer that a line drawn from the midpoint P,4; of the segment of f standing 
over the interval [ans2, dn4a], with slope m,, meets the segment of f over 
[an41, @,] at a point Q, to the left of the midpoint P, of that segment. Thus if 
Cn, dy are the abscissae of Pasy:, Qn, respectively, it is clear that the intervals 
[cas d,,| (n=1, 2,3, - +--+) are disjoint, with 


(dn — > dn — Cn > Ont — Cn = — 


But for each n we have 
[f(dn) — ]/(dn — = in = = 1/ay = 


(tn)(dn — Cn) > (Mtn) (Gn41 — Ong2)/2 = — = 3/32. 


Since d,—C, is <(€,—@Gni2)/2, which tends to 0 with 1/n, and the intervals 


[cn, dn] (n=1, 2, 3, - - - ) are disjoint, it follows that in any interval [0, e], 
and for an arbitrary norm 5>0, the sum whose lim sup is 7}, 4(f) can be 
made arbitrarily large by a suitable choice of points of subdivision of [0, e]. 

We conclude with two more theorems which show that well known properties
of the $L_p$ ($p \ge 1$) classes of functions hold for certain more general $L_\varphi$
classes.


(11.25) THEOREM. If $\varphi$ is convex on $(-\infty, \infty)$ and satisfies (11.6), a necessary
and sufficient condition for (11.18) is that $f$ be $\varphi$-absolutely continuous on $I$.


The sufficiency of the condition follows from Theorem (11.17), the necessity
from the Corollary to Theorem (11.12).


(11.26) THEOREM. If $\varphi$ is convex on $(-\infty, \infty)$ and satisfies (11.7), a necessary
and sufficient condition for (11.18) is that $f$ be of $\varphi$-bounded variation on $I$.


The sufficiency is a consequence of Theorem (11.19); the necessity follows
from Theorem (11.22).


Brown UNIVERSITY, 
PROVIDENCE, R. I. 

THE UNIVERSITY OF CALIFORNIA, 
BERKELEY, CALIF. 


ON SOME SINGULAR MONOTONIC FUNCTIONS WHICH 
ARE STRICTLY INCREASING 


BY 
R. SALEM 


1. A continuous non-decreasing function $f(x)$ defined for $0 \le x \le 1$ ($f(0) = 0$,
$f(1) = 1$) and which is purely singular, that is to say, which has the property


$$df/dx = 0$$


almost everywhere, may be constant in every interval contiguous to a perfect
set of measure zero: it is usually said, in this case, that $f(x)$ is of the Cantor
type. There are, however, monotonic continuous functions, purely singular,
which are increasing in the strict sense, that is, $f(x') > f(x)$ whenever $x' > x$.

While the existence of functions of the Cantor type is almost intuitive
and their construction is immediate by successive approximations, the existence
of strictly increasing singular functions lies deeper. Actually, if we except
Minkowski's function ?(x), of which we shall speak later (and whose singularity
is by no means obvious), no simple direct construction of such functions
seems to be known. Functions of this type usually have been obtained by
"convolutions" of functions of the Cantor type and the proof that they are
singular strictly increasing functions is somewhat difficult(1). Thus, it seems
to be of interest to give simple direct constructions of strictly increasing
singular functions.

2. Let us consider, in the plane, the straight line $PQ$ joining the point $P$
of cartesian coordinates $x$, $y$, to the point $Q$ of cartesian coordinates $x + \Delta x$,
$y + \Delta y$, $\Delta x > 0$, $\Delta y > 0$. Let $\lambda_0$, $\lambda_1$ be two numbers, essentially positive, such
that $\lambda_0 + \lambda_1 = 1$ ($\lambda_0 \ne \lambda_1$). Let us now consider the point $R$ whose coordinates are


$$x + \Delta x/2, \qquad y + \lambda_0 \Delta y,$$


that is to say, the horizontal distance between $P$, $R$ or between $Q$, $R$ is $\Delta x/2$,
while the vertical distance between $P$, $R$ is $\lambda_0 \Delta y$, and the vertical distance
between $R$, $Q$ is $\lambda_1 \Delta y$. If we replace the straight line $PQ$ by the broken line
$PRQ$, we will say that we perform on $PQ$ the transformation $T(\lambda_0, \lambda_1)$.


Presented to the Society, November 28, 1942; received by the editors October 8, 1942. 

(1) See for example Jessen and Wintner, Distribution functions and the Riemann zeta
function, Trans. Amer. Math. Soc. vol. 38 (1935) pp. 48-88 and particularly p. 61; Kershner and
Wintner, On symmetric Bernoulli convolutions, Amer. J. Math. vol. 57 (1935) pp. 541-548;
Wiener and Wintner, Fourier-Stieltjes transforms and singular infinite convolutions, Amer. J.
Math. vol. 60 (1938) pp. 513-522 and particularly p. 521. For earlier examples see Denjoy,
J. Math. Pures Appl. 1915 pp. 204-209 (which was the first example given); Sierpinski, Giornale
di Matematiche vol. 54 (1916) pp. 314-334; Rajchman, Fund. Math. vol. 2 (1921) pp. 50-63.




Definition of a function f(x). Let now $f_0(x)$ be for $0\le x\le 1$ the function
equal to x, that is to say represented by the straight line OA joining the origin
O to the point A(1, 1). Let us perform on OA the transformation $T(\lambda_0, \lambda_1)$.
We get a broken line consisting of two straight lines and representing an
increasing function $f_1(x)$. Let us perform on each of those two straight lines the
transformation $T(\lambda_0, \lambda_1)$. We get a broken line consisting of $2^2$ straight lines
and representing an increasing function $f_2(x)$. Proceeding in the same way we
get after p operations a function $f_p(x)$ strictly increasing ($f_p(0)=0$, $f_p(1)=1$)
represented by a polygonal line consisting of $2^p$ straight lines, the vertices
having for abscissae the points $k/2^p$ ($k=1, 2, \dots, 2^p-1$).
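The construction just described can be carried out mechanically. The following is a minimal sketch, not part of the original argument: the names `apply_T` and `polygon` are hypothetical, and exact rational arithmetic is assumed so that the vertices can be compared without rounding.

```python
from fractions import Fraction

def apply_T(vertices, lam0):
    """Replace every segment PQ of the polygonal line by the broken line PRQ,
    where R = (x + dx/2, y + lam0*dy), as in the transformation T(lam0, lam1)."""
    out = [vertices[0]]
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        r = ((x0 + x1) / 2, y0 + lam0 * (y1 - y0))
        out.extend([r, (x1, y1)])
    return out

def polygon(p, lam0):
    """Vertices of y = f_p(x), obtained from OA by p applications of T."""
    verts = [(Fraction(0), Fraction(0)), (Fraction(1), Fraction(1))]
    for _ in range(p):
        verts = apply_T(verts, lam0)
    return verts

if __name__ == "__main__":
    for x, y in polygon(2, Fraction(1, 3)):
        print(x, y)
```

With $\lambda_0=1/3$ the second stage has vertices at the abscissae 0, 1/4, 1/2, 3/4, 1, and the ordinates are strictly increasing, as the text asserts for every stage.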

Putting

(1) $\max(\lambda_0, \lambda_1) = \mu$

we have necessarily, by our hypothesis, $\mu<1$, and it is immediately seen that

$$|f_{p+1} - f_p| \le \mu^p.$$


Thus $f_p(x)$ converges uniformly to a continuous function f(x) (f(0)=0,
f(1)=1). This function f(x) is strictly increasing because for every p the
vertices of the curve $y=f_p(x)$ belong to the curve y=f(x): thus if f(x) were
constant in some interval, there would be a p for which two different vertices
of $y=f_p(x)$ would have the same ordinate, which is impossible.

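The ordinate of the vertex of $y=f_p(x)$ whose abscissa is given by

$$\theta_1/2 + \theta_2/2^2 + \cdots + \theta_p/2^p \qquad (\theta_i = 0 \text{ or } 1)$$

is given by

$$\theta_1\lambda_0 + \theta_2\lambda_0\lambda_{\theta_1} + \theta_3\lambda_0\lambda_{\theta_1}\lambda_{\theta_2} + \cdots + \theta_p\lambda_0\lambda_{\theta_1}\cdots\lambda_{\theta_{p-1}}$$

and thus, by continuity, if

(2) $$x = \theta_1/2 + \theta_2/2^2 + \cdots + \theta_p/2^p + \cdots$$

we have

(3) $$f(x) = \theta_1\lambda_0 + \theta_2\lambda_0\lambda_{\theta_1} + \cdots + \theta_p\lambda_0\lambda_{\theta_1}\cdots\lambda_{\theta_{p-1}} + \cdots,$$

the series being obviously convergent. If x has two different dyadic
developments, the formula (3) gives for f(x) the same value.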
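As an illustration of formula (3), the following sketch evaluates the series for a finite string of dyadic digits. It assumes that the term of rank k is $\theta_k\lambda_0\lambda_{\theta_1}\cdots\lambda_{\theta_{k-1}}$, which is what the construction gives; the name `f_dyadic` is hypothetical.

```python
from fractions import Fraction

def f_dyadic(digits, lam0):
    """Sum of formula (3) for the finite dyadic development theta_1 ... theta_p.

    Assumption: a digit theta_k = 1 contributes lam0 * lam_{theta_1} ... lam_{theta_{k-1}},
    a digit theta_k = 0 contributes nothing.
    """
    lam = (lam0, 1 - lam0)           # (lambda_0, lambda_1)
    total, prefix = Fraction(0), Fraction(1)
    for theta in digits:
        if theta == 1:
            total += lam0 * prefix
        prefix *= lam[theta]         # product lam_{theta_1} ... lam_{theta_k}
    return total

if __name__ == "__main__":
    lam0 = Fraction(1, 3)
    print(f_dyadic([1], lam0))                        # x = 1/2
    print(float(f_dyadic([0] + [1] * 60, lam0)))      # x = 0.0111... (also 1/2)
```

The two developments 0.1000... and 0.0111... of x=1/2 give values which differ only by the neglected tail of the second series, in agreement with the remark that (3) does not depend on the development chosen.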
Let us remark also that if x and x'>x have the first p digits of their dyadic
developments identical, and equal to $\theta_1, \theta_2, \dots, \theta_p$, then

(4) $$f(x') - f(x) \le \lambda_{\theta_1}\lambda_{\theta_2}\cdots\lambda_{\theta_p}.$$

This is seen immediately by the formula (3) or geometrically.

Proof that f(x) is singular. We shall now prove that the function f(x) is
singular.


1943] SINGULAR MONOTONIC FUNCTIONS 429 


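It is well known that almost all numbers in (0, 1) are "normal" in the
scale of 2, that is, are such that

$$\theta_1 + \theta_2 + \cdots + \theta_p = p/2 + o(p) \quad \text{when } p\to\infty.$$

Let N be the set of these normal numbers. We have meas N=1. Let
us fix an x belonging to N. Let x be given by (2). p being a positive integer,
the number $x+\epsilon_{p+1}/2^{p+1}$, where

$$\epsilon_{p+1} = 1 \text{ if } \theta_{p+1}=0, \qquad \epsilon_{p+1} = -1 \text{ if } \theta_{p+1} = 1,$$

has a dyadic development whose first p digits are the same as for x, that is,
$\theta_1, \theta_2, \dots, \theta_p$. Hence by (4)

$$|f(x+\epsilon_{p+1}/2^{p+1}) - f(x)| \le \lambda_{\theta_1}\lambda_{\theta_2}\cdots\lambda_{\theta_p}.$$

Now x being normal

$$\theta_1 + \theta_2 + \cdots + \theta_p = p/2 + \varphi(p)$$

with $|\varphi(p)|/p\to 0$ when $p\to\infty$. Hence

$$\lambda_{\theta_1}\lambda_{\theta_2}\cdots\lambda_{\theta_p} = \lambda_0^{p/2-\varphi(p)}\lambda_1^{p/2+\varphi(p)} = (\lambda_0\lambda_1)^{p/2}(\lambda_1/\lambda_0)^{\varphi(p)},$$

hence

(5) $$2^{p+1}\,|f(x+\epsilon_{p+1}/2^{p+1}) - f(x)| \le 2\,(4\lambda_0\lambda_1)^{p/2}(\lambda_1/\lambda_0)^{\varphi(p)}.$$

Now $\lambda_0$ and $\lambda_1$ being different and $\lambda_0+\lambda_1$ being equal to 1, we have

$$4\lambda_0\lambda_1 < 1.$$

This, together with $\lim|\varphi(p)|/p=0$, proves that the second member of (5)
tends to zero for $p=\infty$, and thus, if f(x) has a derivative at the point x, this
derivative cannot have a value different from zero. But by a classical theorem
f'(x) exists and is finite almost everywhere, hence almost everywhere in N.
Hence, also almost everywhere, f'(x)=0, which proves our theorem.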

Modulus of continuity of f(x). The vertical distance between two vertices
of abscissae $k/2^p$, $(k+1)/2^p$ being less than $\mu^p$ where $\mu$ is defined by (1), we have
immediately that if $1/2^{p+1}\le x'-x<1/2^p$ then


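Hence f(x) satisfies a Lipschitz condition of order $|\log\mu|/\log 2$.

Fourier-Stieltjes coefficients of $f(x/2\pi)$. Let

$$c_n = \int_0^{2\pi} e^{nix}\,df(x/2\pi).$$

We divide the y-axis by the points of subdivision corresponding to the
vertices existing at the pth stage of the construction of the function and we
observe that the vertical distance between the vertex whose abscissa is

$$2\pi(\theta_1/2 + \theta_2/2^2 + \cdots + \theta_p/2^p)$$

and the following one, which can be written

$$2\pi\Bigl(\theta_1/2 + \theta_2/2^2 + \cdots + \theta_p/2^p + \sum_{q=p+1}^{\infty} 1/2^q\Bigr),$$

has the value $\lambda_{\theta_1}\lambda_{\theta_2}\cdots\lambda_{\theta_p}$. We thus get for approximate expression of the
integral

$$\sum \lambda_{\theta_1}\lambda_{\theta_2}\cdots\lambda_{\theta_p}\,e^{2\pi ni(\theta_1/2+\theta_2/2^2+\cdots+\theta_p/2^p)},$$

the summation being extended to the $2^p$ combinations of the values 0 and 1
of the $\theta_i$. This sum is equal to

$$\prod_{k=1}^{p}\bigl[\lambda_0 + \lambda_1 e^{2\pi ni/2^k}\bigr]$$

and thus, making $p=\infty$, we have

$$c_n = \prod_{k=1}^{\infty}\bigl[\lambda_0 + \lambda_1 e^{2\pi ni/2^k}\bigr].$$

We can also write

$$c_n = \prod_{k=1}^{\infty} e^{\pi ni/2^k}\prod_{k=1}^{\infty}\bigl[\lambda_0 e^{-\pi ni/2^k} + \lambda_1 e^{\pi ni/2^k}\bigr]$$

or, putting $\lambda_1-\lambda_0=r$,

$$c_n = e^{\pi ni}\prod_{k=1}^{\infty}\bigl[\cos(\pi n/2^k) + ir\sin(\pi n/2^k)\bigr];$$

that gives

$$|c_n|^2 = \prod_{k=1}^{\infty}\bigl[\cos^2(\pi n/2^k) + r^2\sin^2(\pi n/2^k)\bigr].$$

If we take $n=2^m$, we have

$$|c_{2^m}|^2 > r^2\cos^2(\pi/4)\cos^2(\pi/8)\cos^2(\pi/16)\cdots$$

and thus $c_n$ does not tend to zero for $n=\infty$.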
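The last inequality can be checked numerically. The sketch below is illustrative only: it assumes the product formula for $|c_n|^2$ in the form $\prod[\cos^2(\pi n/2^k)+r^2\sin^2(\pi n/2^k)]$, truncates the infinite product after a fixed number of factors (the parameter `terms` is an assumption of the sketch), and uses the classical evaluation $\cos(\pi/4)\cos(\pi/8)\cdots=2/\pi$ for the lower bound.

```python
import math

def abs_c_squared(n, r, terms=200):
    """Truncated product for |c_n|^2; factors with large k are very close to 1."""
    prod = 1.0
    for k in range(1, terms + 1):
        a = math.pi * n / 2.0**k
        prod *= math.cos(a)**2 + (r * math.sin(a))**2
    return prod

if __name__ == "__main__":
    r = 0.2
    bound = r * r * (2 / math.pi)**2      # r^2 cos^2(pi/4) cos^2(pi/8) ...
    for m in range(1, 8):
        print(m, abs_c_squared(2**m, r), bound)
```

The printed values are the same for every m and exceed the bound, so that $|c_{2^m}|$ stays away from zero; for r=0, that is for f(x)=x, the product vanishes as it should.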

3. Generalization of the preceding function. Instead of constructing our
function with an infinity of identical transformations $T(\lambda_0, \lambda_1)$, let us change
the transformation used at every step of the construction.

Thus $F_0(x)$ being equal to x in (0, 1) let O be the point (0, 0), A the point
(1, 1) and let us perform on OA the transformation $T(\lambda_0^{(1)}, \lambda_1^{(1)})$. We get a
broken line consisting of two straight lines and representing $F_1(x)$. On each
of those two straight lines we perform the transformation $T(\lambda_0^{(2)}, \lambda_1^{(2)})$; on
the $2^2$ straight lines constituting $F_2(x)$ we perform the transformation
$T(\lambda_0^{(3)}, \lambda_1^{(3)})$ to get $F_3(x)$, and so on.

Let

$$\lambda_0^{(k)} = (1-r_k)/2, \qquad \lambda_1^{(k)} = (1+r_k)/2.$$

We assume that $-1<r_k<1$ for every k and that if we put

$$\mu_k = \max(\lambda_0^{(k)}, \lambda_1^{(k)}),$$

the series $\sum\mu_1\mu_2\cdots\mu_p$ converges. (This is certainly the case, for example, if $-a<r_k<a$,
$0<a<1$, but can be secured under less stringent conditions.) Then there is
no change in the argument used in §2 to prove that $F_p(x)$ tends uniformly to
a continuous function F(x) strictly increasing from 0 to 1 in the interval (0, 1).
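In the same way as before we prove that if

(6) $$x = \theta_1/2 + \theta_2/2^2 + \cdots + \theta_p/2^p + \cdots$$

we have

(7) $$F(x) = \theta_1\lambda_0^{(1)} + \theta_2\lambda_0^{(2)}\lambda_{\theta_1}^{(1)} + \theta_3\lambda_0^{(3)}\lambda_{\theta_1}^{(1)}\lambda_{\theta_2}^{(2)} + \cdots.$$

Finally, if x and x'>x have the same first p digits in their dyadic
development, we have

(8) $$F(x') - F(x) \le \lambda_{\theta_1}^{(1)}\lambda_{\theta_2}^{(2)}\cdots\lambda_{\theta_p}^{(p)},$$

where

$$\lambda_{\theta_k}^{(k)} = \lambda_0^{(k)} \text{ if } \theta_k = 0, \qquad \lambda_{\theta_k}^{(k)} = \lambda_1^{(k)} \text{ if } \theta_k = 1.$$

It will be useful to observe that if x is given by (6) we can also write

$$\lambda_{\theta_k}^{(k)} = (1/2)(1 - (-1)^{\theta_k}r_k) = (1/2)(1 - r_k\varphi_k(x)),$$

where $\{\varphi_k(x)\}$ denotes the system of Rademacher's functions ($k=1, 2, \dots$).
Thus, with this notation, the inequality (8) is written

$$F(x') - F(x) \le (1/2^p)\prod_{k=1}^{p}(1 - r_k\varphi_k(x)).$$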

We can now prove the following theorem. 


THEOREM. The function F(x) is purely singular when, and only when, the
series $\sum_{k=1}^{\infty} r_k^2$ diverges.


We shall make use of the following theorem, due to Zygmund(2): for
almost all x we have

$$\liminf_{p=\infty}\sum_{k=1}^{p} -r_k\varphi_k(x) = -\infty$$

if the series $\sum r_k^2$ diverges.
We deduce immediately from this result and from the inequality

$$1 - r_k\varphi_k(x) \le e^{-r_k\varphi_k(x)}$$

that for almost all x

(9) $$\liminf_{p=\infty}\prod_{k=1}^{p}(1 - r_k\varphi_k(x)) = 0$$

provided that $\sum r_k^2=\infty$.
The proof of the first part of our theorem is now immediate. Taking an x
belonging to the set E (meas E=1) for which (9) holds, we have

$$|F(x+\epsilon_{p+1}/2^{p+1}) - F(x)| \le (1/2^p)\prod_{k=1}^{p}(1 - r_k\varphi_k(x)),$$

hence

$$\liminf\, 2^{p+1}\,|F(x+\epsilon_{p+1}/2^{p+1}) - F(x)| = 0$$

and if F'(x) exists it is equal to zero. The proof is completed as above.

To prove the second part of our theorem let us suppose that $\sum r_k^2<\infty$.
We know by a classical theorem that in this case the series $\sum r_k\varphi_k(x)$ converges
in a set $\mathcal{E}$ of measure 1. From this and from the hypothesis $\sum r_k^2<\infty$ it is
easy to deduce that the infinite product

(10) $$\prod_{k=1}^{\infty}(1 - r_k\varphi_k(x))$$

is convergent when x belongs to $\mathcal{E}$. Fixing an x belonging to $\mathcal{E}$, let us remark
that, $\epsilon_{p+1}$ having the same signification as above, the dyadic developments of x
and $x+\epsilon_{p+1}/2^{p+1}$ have all their digits equal except the digits of rank p+1.
From this it is easy to deduce, for example geometrically, that
| (1) (p) €p+1 
— < + — F(x) 
1+] (>) 


(2) Zygmund, On lacunary trigonometric series, Trans. Amer. Math. Soc. vol. 34 (1932)
p. 435. The proof given there for lacunary trigonometric series is immediately applicable to
Rademacher's functions.




Now the first part of this inequality together with the convergence of the
product (10) shows that

$$\liminf_{p=\infty}\, 2^{p+1}\,|F(x+\epsilon_{p+1}/2^{p+1}) - F(x)| > 0.$$

Hence, whenever F'(x) exists for $x\in\mathcal{E}$, F'(x) is not zero. Remembering
that F'(x) exists and is finite almost everywhere, we have that $F'(x)\neq 0$
almost everywhere, and thus F(x) cannot be purely singular. This completes
the proof of the theorem.
Modulus of continuity of F(x). The argument is the same as before. If 


$$1/2^{p+1} \le x' - x < 1/2^p,$$


we have 


1 
F(x’) — F(x) < nl, 
Thus if w(6) is the modulus of continuity, we have 


thal) 


w(6) << 4 


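Fourier-Stieltjes coefficients of $F(x/2\pi)$. There is no change in the argument
used above for f(x) to prove that if

$$c_n = \int_0^{2\pi} e^{nix}\,dF(x/2\pi)$$

we have for approximate expression of $c_n$

$$\sum \lambda_{\theta_1}^{(1)}\lambda_{\theta_2}^{(2)}\cdots\lambda_{\theta_p}^{(p)}\,e^{2\pi ni(\theta_1/2+\theta_2/2^2+\cdots+\theta_p/2^p)},$$

the summation being extended to the $2^p$ combinations of the values 0 and 1
of the $\theta_i$. We get thus

$$c_n = \prod_{k=1}^{\infty}\bigl[\lambda_0^{(k)} + \lambda_1^{(k)}e^{2\pi ni/2^k}\bigr] = e^{\pi ni}\prod_{k=1}^{\infty}\bigl[\cos(\pi n/2^k) + ir_k\sin(\pi n/2^k)\bigr]$$

and

$$|c_n|^2 = \prod_{k=1}^{\infty}\bigl[\cos^2(\pi n/2^k) + r_k^2\sin^2(\pi n/2^k)\bigr].$$

It is immediately seen, as in the case of f(x), that if $r_k$ does not tend
to zero, we have $|c_n|\neq o(1)$.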

4. The Minkowski function ?(x). This function was defined by
Minkowski(3) for the purpose of establishing a one-one correspondence between the
rational numbers of (0, 1) and the quadratic irrationals of (0, 1). The
properties of the function have been recently investigated by Denjoy(4) who has
proved that it is purely singular and given other important properties and
generalizations of Minkowski's function.

We propose to give here some new indications about this function, con- 
cerning particularly its modulus of continuity and its Fourier-Stieltjes coeffi- 
cients. For the sake of completeness we shall give the definition of the func- 
tion and the proof of its singularity. 

Definition of the function ?(x). We define first

$$?(0) = ?(0/1) = 0, \qquad ?(1) = ?(1/1) = 1.$$

We next take the "mediant" 1/2=(0+1)/(1+1) of the two Farey fractions
0/1 and 1/1 and we define ?((0+1)/(1+1)) to be the arithmetic mean between
?(0) and ?(1), that is, 1/2.

We define in the same way

$$?(1/3) = ?\Bigl(\frac{0+1}{1+2}\Bigr) = \frac{?(0) + ?(1/2)}{2} = 1/4,$$

$$?(2/3) = ?\Bigl(\frac{1+1}{2+1}\Bigr) = \frac{?(1/2) + ?(1)}{2} = 3/4.$$

Generally if, by this process, we have defined ?(p/q) and ?(p'/q') for two
consecutive irreducible fractions p/q, p'/q', we define

$$?\Bigl(\frac{p+p'}{q+q'}\Bigr) = \frac{?(p/q) + ?(p'/q')}{2}.$$

At the nth stage the function is defined for $2^n+1$ values of x and the
ordinates corresponding to these values of x are of the form $k/2^n$
($k=0, 1, 2, \dots, 2^n$). The definition of ?(x) for every x follows by continuity.
(3) H. Minkowski, Gesammelte Abhandlungen vol. 2 (1911) pp. 50-51.
(4) A. Denjoy, C. R. Acad. Sci. Paris vol. 194 (1932) pp. 44-45 and J. Math. Pures Appl.
vol. 17 (1938) pp. 105-151.

Let now x be a rational number put in the form of a finite continued
fraction:

$$x = (a_0, a_1, \dots, a_n), \qquad a_0 = 0 \quad (0\le x\le 1).$$

Let $p_0/q_0, p_1/q_1, \dots, p_n/q_n=x$ be the successive convergents ($p_0/q_0=0/1$,
$p_1/q_1=1/a_1, \dots$). Let us assume that at a certain stage (the mth) of the
construction of ?(x) the fractions $p_{k-2}/q_{k-2}$, $p_{k-1}/q_{k-1}$ are consecutive. (This
happens certainly for $p_0/q_0=0$ and $p_1/q_1=1/a_1$, $1/a_1$ appearing as consecutive
to 0/1 at the $(a_1-1)$th stage of the construction.) Let


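$$y_{k-2} = ?(p_{k-2}/q_{k-2}), \qquad y_{k-1} = ?(p_{k-1}/q_{k-1}).$$

Now it is well known that $(p_{k-1}+p_{k-2})/(q_{k-1}+q_{k-2})$ is irreducible and thus
at the next stage (the (m+1)th) the fraction (irreducible) $(p_{k-1}+p_{k-2})/(q_{k-1}+q_{k-2})$
will appear with

$$?\Bigl(\frac{p_{k-1}+p_{k-2}}{q_{k-1}+q_{k-2}}\Bigr) = \frac{y_{k-1}+y_{k-2}}{2}.$$

By definition

$$?\Bigl(\frac{2p_{k-1}+p_{k-2}}{2q_{k-1}+q_{k-2}}\Bigr) = \frac{y_{k-1} + (y_{k-1}+y_{k-2})/2}{2}.$$

Continuing in the same way we see that

$$?\Bigl(\frac{a_kp_{k-1}+p_{k-2}}{a_kq_{k-1}+q_{k-2}}\Bigr) = ?\Bigl(\frac{p_k}{q_k}\Bigr) = \frac{y_{k-1}}{2} + \frac{y_{k-1}}{2^2} + \cdots + \frac{y_{k-1}}{2^{a_k}} + \frac{y_{k-2}}{2^{a_k}};$$

hence, if we put $?(p_k/q_k)=y_k$,

$$y_k = (1 - (1/2^{a_k}))\,y_{k-1} + y_{k-2}/2^{a_k},$$
$$y_k - y_{k-1} = -(1/2^{a_k})(y_{k-1} - y_{k-2}).$$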


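Now $p_k/q_k$ when it appears is consecutive to $p_{k-1}/q_{k-1}$. Hence we can repeat
the argument, and if $y_n=?(p_n/q_n)=?(x)$, we have

$$y_n - y_{n-1} = (-1/2^{a_n})(-1/2^{a_{n-1}})\cdots(-1/2^{a_2})(y_1 - y_0).$$

Now $y_0=0$ and $y_1=?(1/a_1)=1/2^{a_1-1}$, hence

$$y_n - y_{n-1} = (-1)^{n-1}\frac{1}{2^{(a_1+\cdots+a_n)-1}}$$

and thus

$$y_n = \frac{1}{2^{a_1-1}} - \frac{1}{2^{(a_1+a_2)-1}} + \cdots + (-1)^{n-1}\frac{1}{2^{(a_1+\cdots+a_n)-1}}.$$

Now by continuity we get the following result: if

$$x = (0, a_1, a_2, \dots, a_n, \dots)$$

we have

(11) $$?(x) = \frac{1}{2^{a_1-1}} - \frac{1}{2^{(a_1+a_2)-1}} + \cdots + (-1)^{n-1}\frac{1}{2^{(a_1+\cdots+a_n)-1}} + \cdots$$

and it is easy to see that if x is rational, the two different developments of x
give the same ?(x).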
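The agreement between the definition by mediants and formula (11) can be verified on the fractions produced by the first few stages. The following sketch is illustrative; the function names are hypothetical, and formula (11) is assumed in the form of the alternating sum of the terms $1/2^{(a_1+\cdots+a_k)-1}$.

```python
from fractions import Fraction

def partial_quotients(x):
    """Partial quotients a_1, ..., a_n of a rational 0 <= x <= 1, x = (0, a_1, ..., a_n)."""
    quotients = []
    while x != 0:
        x = 1 / x
        a = x.numerator // x.denominator
        quotients.append(a)
        x -= a
    return quotients

def minkowski_cf(x):
    """?(x) by formula (11): alternating sum of 1 / 2^(a_1 + ... + a_k - 1)."""
    total, exponent, sign = Fraction(0), -1, 1
    for a in partial_quotients(x):
        exponent += a
        total += sign * Fraction(1, 2**exponent)
        sign = -sign
    return total

def minkowski_mediants(stages):
    """?(x) on the fractions obtained after `stages` insertions of mediants."""
    table = [((0, 1), Fraction(0)), ((1, 1), Fraction(1))]
    for _ in range(stages):
        refined = [table[0]]
        for ((p, q), y), ((p2, q2), y2) in zip(table, table[1:]):
            refined.append(((p + p2, q + q2), (y + y2) / 2))   # mediant, mean value
            refined.append(((p2, q2), y2))
        table = refined
    return {Fraction(p, q): y for (p, q), y in table}

if __name__ == "__main__":
    for x, y in sorted(minkowski_mediants(3).items()):
        print(x, y, minkowski_cf(x))
```

For every fraction listed the two values coincide; for example both procedures give ?(1/3)=1/4 and ?(2/3)=3/4.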

From this we deduce the more elementary properties of ?(x), namely: 

If x is rational, ?(x) is of the form $k/2^s$ (k, s integers).

If x is irrational, the dyadic development of ?(x) is infinite.

If x is a quadratic irrational, $(0, a_1, a_2, \dots)$ is periodic and thus ?(x), being
the difference of two periodic dyadic developments, is rational.

It is not difficult to see that the converses of these results are true.

The fact that ?(x) is strictly increasing is an immediate consequence of its 
construction. 

Proof of the singularity of ?(x). Let $x=(0, a_1, \dots, a_n, \dots)$. We know
that for almost all x $\limsup a_n=\infty$. Let N be the set of such numbers (meas
N=1) and let us fix an x belonging to N. Let
?(x)=y and let


bn/ Gn (0, an), Pn = 
and let us write, as usual (@n41, Gni2, ). We have 


+ | Pn 1 
? 
On419n + Qn + Gn—1)9n 


and thus 
1 | 1 
< 


< 
(@n41 + qn 


1 1 


which gives 


| — Pn 2 (ait ++ ++ant1)—1 


Hence, we have 


2 
= < 2(Gn41 + 
Qait 


5, = 


and 
1 1 
and 




6 = 


2 
Y — Pn-1 | OnQn—1 


Consequently, 


bn 1 2 a \? 
an Qn—1 
1 a. 2 n 
( + Joon + < 


Denti 


<2 


Qanti 


C being an absolute constant. 
Now we can certainly find an infinite subsequence $\{a_{n_k}\}$ of the $\{a_n\}$ such
that $a_{n_k}<a_{n_k+1}$ and $a_{n_k+1}\to\infty$, hence

$$\liminf\, \delta_n/\delta_{n-1} = 0.$$


Now if dy/dx exists, is finite, and is different from zero at the point x,
$\delta_n/\delta_{n-1}$ must tend to 1. Hence, at any point $x\in N$, dy/dx cannot exist, be
finite, and be different from zero. But dy/dx exists and has a finite value
almost everywhere. Then the only possible conclusion is dy/dx=0 almost
everywhere, which proves the singularity of the function.

Modulus of continuity of ?(x). We need the following result on continued
fractions, which to our knowledge has not been stated:


LEMMA. Let $p_n/q_n=(0, a_1, a_2, \dots, a_n)$. Let $\theta$ be the Fibonacci number
$(1/2)(5^{1/2}+1)$. We have the inequality $q_n\le\theta^{a_1+a_2+\cdots+a_n}$.


We shall prove this lemma by induction. We have $q_1=a_1<\theta^{a_1}$ for it is
easily seen that $m<\theta^m$ for every positive integer m. We have also $q_0=1=\theta^0$.
And we have generally $q_k=a_kq_{k-1}+q_{k-2}$ ($k=2, 3, \dots, n$). If supposing the
lemma true for n=k-2 and n=k-1 we prove that it is true for n=k, we
will have proved the result as stated. It is sufficient to prove that

$$a_k\theta^{a_1+\cdots+a_{k-1}} + \theta^{a_1+\cdots+a_{k-2}} \le \theta^{a_1+\cdots+a_k},$$

that is, $a_k\theta^{a_{k-1}}+1\le\theta^{a_{k-1}+a_k}$ or $a_k+1/\theta^{a_{k-1}}\le\theta^{a_k}$. Hence it is sufficient to prove
that $a_k+1/\theta\le\theta^{a_k}$. Now for $a_k=1$ we have the equality $1+1/\theta=\theta$ and it is
easy to see that $2+1/\theta=\theta^2$ and that the function $\theta^x-x$ increases when $x\ge 2$.
Hence the lemma is proved. (It is easy to see, by considering the number
(0, 1, 1, 1, ...) that this result is the best possible of its kind, in order of
magnitude.)
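The lemma lends itself to a direct verification on arbitrary partial quotients. The sketch below is illustrative and assumes the statement in the form $q_n\le\theta^{a_1+\cdots+a_n}$; to avoid rounding it uses the identity $\theta^s=(L_s+F_s\,5^{1/2})/2$, where $F_s$ and $L_s$ are the Fibonacci and Lucas numbers. The function names are hypothetical.

```python
import random

def denominators(quotients):
    """q_1, ..., q_n for (0, a_1, ..., a_n), from q_k = a_k q_{k-1} + q_{k-2}."""
    q_prev, q = 0, 1              # q_{-1} = 0, q_0 = 1
    out = []
    for a in quotients:
        q_prev, q = q, a * q + q_prev
        out.append(q)
    return out

def at_most_theta_power(q, s):
    """Exact test of q <= theta**s, using theta**s = (L_s + F_s*sqrt(5)) / 2."""
    fib, fib_next = 0, 1          # F_0, F_1
    luc, luc_next = 2, 1          # L_0, L_1
    for _ in range(s):
        fib, fib_next = fib_next, fib + fib_next
        luc, luc_next = luc_next, luc + luc_next
    d = 2 * q - luc               # q <= theta**s  iff  d <= F_s * sqrt(5)
    return d <= 0 or d * d <= 5 * fib * fib

def lemma_holds(quotients):
    """Check q_k <= theta**(a_1 + ... + a_k) for every k."""
    s = 0
    for a, q in zip(quotients, denominators(quotients)):
        s += a
        if not at_most_theta_power(q, s):
            return False
    return True

if __name__ == "__main__":
    random.seed(0)
    samples = [[random.randint(1, 6) for _ in range(10)] for _ in range(5)]
    print([lemma_holds(a) for a in samples])
```

For the quotients all equal to 1 the denominators are the Fibonacci numbers, which is the case mentioned in the text as showing that the inequality cannot be improved in order of magnitude.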

We can now proceed to determine the modulus of continuity of ?(x). In
the definition of the function by successive approximations, we start from the
Farey fractions 0/1 and 1/1 and in a first operation we introduce the mediant
1/2, in a second operation the two mediants 1/3 and 2/3, in a third operation,
four mediants, and so on. In the pth operation we introduce $2^{p-1}$ mediants
and we get a sequence of fractions containing

$$2 + (1 + 2 + \cdots + 2^{p-1}) = 2^p + 1$$

fractions, which we can call the Minkowski sequence of order p and denote by
$\mathfrak{M}_p$. To the sequence $\mathfrak{M}_p$ corresponds, by the transformation y=?(x), the
sequence of numbers $k/2^p$ ($k=0, 1, 2, \dots, 2^p$). The formula (11) giving the value
of ?(x) shows that the fractions belonging to $\mathfrak{M}_p$ are those which, when written
in the form $(0, a_1, a_2, \dots, a_n)$, are such that $\sum a_k$ does not exceed p+1.
Hence, by the lemma, if $\alpha/\beta$ belongs to $\mathfrak{M}_p$, we have $\beta\le\theta^{p+1}$, and this order
of magnitude is actually attained for the fraction (0, 1, 1, ..., 1) where 1 is
repeated p+1 times. Now it is immediately seen, by induction, that if $\alpha/\beta$,
$\alpha'/\beta'$ are two consecutive fractions of $\mathfrak{M}_p$, we have $|\beta'\alpha-\beta\alpha'|=1$ and thus
the distance between two consecutive fractions of $\mathfrak{M}_p$ is greater than $1/\theta^{2p+2}$.

Let now x, x' be two irrational points of (0, 1), y=?(x), y'=?(x'). At a
certain stage of the dissection one fraction $x_0$ appears for the first time in
(x, x'). Let us continue the dissection until one fraction appears for the first
time in $(x, x_0)$ or in $(x_0, x')$ or in both intervals. Let this stage of the dissection
be the pth, then we have $x'-x>1/\theta^{2p+2}$ and $y'-y<4/2^p$. Hence,

$$(2p+2)\log\theta > \log\frac{1}{x'-x}, \qquad (p-2)\log 2 < \log\frac{1}{y'-y},$$

which proves that

$$y' - y < C\,(x'-x)^{(1/2)\log 2/\log\theta},$$

C being an absolute constant, and this relation being true for every couple of
irrationals x, x' is also valid for x or x' or both rational. Hence: the function
?(x) satisfies a Lipschitz condition of order $\alpha=(1/2)\log 2/\log\theta$ where $\theta$ is the
Fibonacci number $(1/2)(5^{1/2}+1)$.

We shall now prove that $\alpha$ is the best possible exponent for the Lipschitz
condition of ?(x) and that it cannot be improved.
Let us consider, in fact, the number

$$x = (0, 1, 1, \dots) = (5^{1/2}-1)/2 = 1/\theta.$$

The corresponding value of the function is

$$\eta = ?(x) = 1 - 1/2 + 1/2^2 - \cdots = 2/3.$$

Let $p_n/q_n$ be the successive convergents of x. It is well known that

$$q_n = (1/5^{1/2})\bigl[\theta^{n+1} - (-1/\theta)^{n+1}\bigr], \qquad p_n = q_{n-1}.$$

Now

$$\delta = |x - p_n/q_n| < 1/q_n^2,$$

which is of the same order as $1/\theta^{2n}$, whereas

$$\eta - ?(p_n/q_n) = (-1)^n(1/2^n) + (-1)^{n+1}(1/2^{n+1}) + \cdots$$

is of order $1/2^n$, that is, of order $\delta^{(1/2)\log 2/\log\theta}$, which proves that the number $\alpha$
of our Lipschitz condition is the best possible one.
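The behaviour along the convergents of $1/\theta$ can be observed numerically. The sketch below is illustrative only: it takes $\alpha=(1/2)\log 2/\log\theta$, computes $?(p_n/q_n)$ as the partial sum $1-1/2+\cdots$ of (11), and uses the identity $|1/\theta-p_n/q_n|=\theta^{-(n+1)}/q_n$ (a consequence of the formula for $q_n$, stated here as an assumption of the sketch) to avoid cancellation in floating point.

```python
import math
from fractions import Fraction

THETA = (1 + math.sqrt(5)) / 2
ALPHA = math.log(2) / (2 * math.log(THETA))

def ratio(n):
    """|?(x) - ?(p_n/q_n)| / |x - p_n/q_n|**ALPHA for x = (0, 1, 1, ...) = 1/theta."""
    p, q = 1, 1                       # p_1/q_1 = 1/1
    y = Fraction(1)                   # ?(p_1/q_1) from formula (11)
    for k in range(2, n + 1):
        p, q = q, p + q               # next convergent of (0, 1, 1, ...)
        y += Fraction((-1) ** (k - 1), 2 ** (k - 1))
    dy = abs(Fraction(2, 3) - y)      # exact, equals (2/3) / 2**n
    dx = THETA ** (-(n + 1)) / q      # |1/theta - p_n/q_n|
    return float(dy) / dx ** ALPHA

if __name__ == "__main__":
    print(ALPHA)
    for n in (5, 10, 20, 30):
        print(n, ratio(n))
```

The exponent is about 0.72, and the printed ratios remain between two positive constants as n increases, which is what the argument above requires: no exponent larger than $\alpha$ would keep them bounded.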
Fourier-Stieltjes coefficients of $?(x/2\pi)$. Let

$$c_n = \int_0^{2\pi} e^{nix}\,d?(x/2\pi).$$

It is immediately seen that

$$c_n = \lim_{p\to\infty}\frac{1}{2^p}\sum e^{2\pi ni\rho},$$

where the summation is extended to all fractions $\rho$ belonging to $\mathfrak{M}_p$.

It does not seem to be known whether $c_n$ tends to zero for $n=\infty$. If we
confine ourselves to the behavior of $c_n$ "in the average," we get the following
result. It is well known by a theorem of Wiener(5) that

$$|c_1|^2 + \cdots + |c_n|^2 < An\,\omega(1/n),$$

$\omega(\delta)$ being the modulus of continuity of the function and A an absolute
constant. Hence, by our result on the modulus of continuity of ?(x) we have

$$|c_1|^2 + |c_2|^2 + \cdots + |c_n|^2 = O(n^{1-(1/2)\log 2/\log\theta})$$

and, by Schwarz's inequality,

$$|c_1| + |c_2| + \cdots + |c_n| = O(n^{1-(1/4)\log 2/\log\theta}).$$


(5) See, for example, Zygmund, Trigonometrical series, p. 221.


HARVARD UNIVERSITY, 
CAMBRIDGE, Mass.

MASSACHUSETTS INSTITUTE OF TECHNOLOGY, 
CAMBRIDGE, Mass. 


ON THE PARTIAL SUMS OF FOURIER SERIES AT 
POINTS OF DISCONTINUITY 


BY 
OTTO SZASZ 


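1. Introduction. Consider a Fourier sine series

(1.1) $$f(\theta) \sim \sum_1^\infty b_\nu\sin\nu\theta, \qquad b_\nu = (2/\pi)\int_0^\pi f(\theta)\sin\nu\theta\,d\theta,$$

and write

(1.2) $$s_n(\theta) = \sum_1^n b_\nu\sin\nu\theta.$$

Fejér proved (cf. Zygmund [5, p. 181](1)) that if $f(\theta)$ is of bounded variation,
and if $n\theta_n\to a$ as $\theta_n\to 0$, then

(1.3) $$s_n(\theta_n) \to (2/\pi)f(+0)\int_0^a\frac{\sin t}{t}\,dt = (2/\pi)f(+0)I(a).$$

In particular, choosing a so that $I(a)=\pi/2=\int_0^\infty t^{-1}\sin t\,dt$ (thus $\int_a^\infty t^{-1}\sin t\,dt=0$),
we get $s_n(\theta_n)\to f(+0)$, which is half of the jump of $f(\theta)$ at $\theta=0$.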
On the other hand for a=7, which gives J(a) its maximal value 


* sin? 
sa(0n) (+ 0) f dt = f(+ 9) X 1.08949 --- 


Thus the limit points of the partial sums as $\theta_n\to 0$ cover an interval which
extends beyond f(+0), if $f(+0)\neq 0$. This is called Gibbs' phenomenon.

It was also proved by Fejér and Csillag (for references and further results
see Szász [4]) that for functions of bounded variation

(1.4) $$n^{-1}\sum_1^n \nu b_\nu \to (2/\pi)f(+0), \qquad n\to\infty.$$

These facts suggest the consideration of

$$\sum_1^n \nu b_\nu\,\frac{\sin\nu\theta_n}{\nu}, \qquad \theta_n\to 0,$$

as a transform of the sequence $\{\nu b_\nu\}$, that is, as a special case of the
triangular type transform


Presented to the Society, September 8, 1942; received by the editors September 8, 1942.
(1) Numbers in brackets refer to the literature at the end of this paper.




(1.5) $$T_n = \sum_1^n a_{n\nu}\tau_\nu,$$

where now $\tau_\nu=\nu b_\nu$, $a_{n\nu}=\nu^{-1}\sin\nu\theta_n$. We shall not restrict ourselves to
regularity conditions, and we shall not assume convergence of the sequence $\{\tau_n\}$,
but merely Cesàro summability of some order. We then seek simple necessary
and sufficient conditions for the convergence of the transform $T_n$ (in general
to a different limit). The application to Fourier sine series yields a generalized
Gibbs' phenomenon, and also a new device to determine the generalized
jump of a function. Our results are in close relationship with some results of
Rogosinski [1, 2].
We consider more generally the transform

(1.6) $$T_n(\rho_n, \theta_n) = \sum_1^n \rho_n^\nu\tau_\nu\nu^{-1}\sin\nu\theta_n, \qquad \rho_n>0,\ \theta_n>0,$$

which in the case $\tau_\nu=\nu b_\nu$ becomes $\sum_1^n\rho_n^\nu b_\nu\sin\nu\theta_n=s_n(\rho_n, \theta_n)$, where $s_n(\rho, \theta)$ is
the nth partial sum of the harmonic series $\sum_1^\infty\rho^\nu b_\nu\sin\nu\theta$.

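2. Permanency with respect to convergent sequences. It is well known
that the convergence of the sequence $\{\tau_n\}$ implies the convergence of the
transform $T_n$, if and only if

$$\lim_{n\to\infty} a_{n\nu} = 0, \quad \text{for } \nu = 1, 2, 3, \dots;$$

$$\sum_1^n |a_{n\nu}| = O(1), \quad n\to\infty;$$

$$\lim_{n\to\infty}\sum_1^n a_{n\nu} = \sigma \text{ exists.}$$

We then have $\lim T_n=\sigma\lim\tau_n$. If we restrict ourselves to sequences $\tau_n\to 0$,
then the last condition can be omitted. Applied to (1.6) this yields the
necessary and sufficient conditions:

(2.1) $$\sum_1^n \rho_n^\nu\nu^{-1}|\sin\nu\theta_n| = O(1),$$

(2.2) $$\lim\sum_1^n \rho_n^\nu\nu^{-1}\sin\nu\theta_n = \sigma.$$

In particular the last condition is $s_n(\rho_n, \theta_n)\to\sigma$ for the harmonic series
$\sum_1^\infty\rho^\nu\nu^{-1}\sin\nu\theta=\arctan\{(\rho\sin\theta)/(1-\rho\cos\theta)\}$.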
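The closed form of the harmonic series quoted above is easily checked for $0<\rho<1$. The following sketch is illustrative only; the function names are hypothetical, and the series is assumed in the form $\sum\rho^\nu\nu^{-1}\sin\nu\theta$.

```python
import math

def harmonic_partial_sum(rho, theta, n):
    """n-th partial sum of sum rho^v v^-1 sin(v theta)."""
    return sum(rho**v * math.sin(v * theta) / v for v in range(1, n + 1))

def harmonic_sum(rho, theta):
    """Closed form arc tan{(rho sin theta)/(1 - rho cos theta)}, valid for 0 < rho < 1."""
    return math.atan(rho * math.sin(theta) / (1 - rho * math.cos(theta)))

if __name__ == "__main__":
    rho, theta = 0.9, 0.7
    print(harmonic_partial_sum(rho, theta, 400), harmonic_sum(rho, theta))
```

For fixed $\rho<1$ the partial sums converge geometrically; the interest of the conditions (2.1) and (2.2) lies in the case where $\rho_n$ and $\theta_n$ vary with n, which the sketch does not cover.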
We first assume 


(2.3) 0 < lim inf p, < lim sup p, < @; 




in this case for some ¢>0, c2>0 


a> |sin | < sin | <a |sin |, 
1 1 


thus (2.1) reduces to 

(2.4) sin | = O(1), 
Now for any 

(2.5) sin v0 | < 


hence 26, =O(1) implies (2.4). To prove the converse let 0,<1<0,(m—1), 
and put so that Now 
—cos 2v6,), and 


cos 2x, | < +| cos 206, 
1 1 


«+1 


< 1+ log« + (1/(« + 1)) max | cos 296, 


a<ASn ! «+1 


< 1 + log 0, + 6,/sin 0, < 3 — log 4. 


2) sin > log + log — 3 = — 3 + log (n8,); 
1 


hence (2.4) implies 9,=O(1). For null sequences only this is required. 
To satisfy (2.2) consider the case that 0 is a limit point of the sequence { n0,} ; 
for a subsequence of indices m: 20,0, and for that subsequence, using (2.5) 


> sin = o( > sin v6, = O(nb,) = o(1). 
1 1 
Hence ¢, if it exists, is 0 and then every convergent sequence is transformed 
into a null sequence. Next assume lim inf 78, >0. We choose a subsequence of 
integers n=n’ for which p% and 76, have limits n’0,,-—8>0, p¥—e? say; by 
(2.3) is finite. Furthermore from log p/(p—1)—>1 as p—1, 
Suppose first y=0, that is py—1, and m’(p,,-—1)—0. Now, as m runs 
through the sequence {n’} 


> (on — sin < | pn — 1 sin | = 0(1)O(n6,) = 0(1); 
1 1 


Thus 




lim sin »@, = lim sin 
1 1 


for n=n'— , if either side exists. But 


n 6 n 6 oj 1/2 
= f ( cost) at f 
(m+1/2)0 sin ade 


(2m + 1) sin (u/(2n + 1) 


= (1/2)6 + 
hence 


sin % udu 


u “(Qn + 1) sin (u/(2nm + 1)) 


= — (1/2)0, +f + o(1) 
0 


1 
sin 
du 
0 u 
as n—»© through the sequence {n’}. The consideration of the case +0 re- 
mains; we write 


n 
sin v0 
1 


cos t—p?+-p**? cos nmt—p*t! cos (n+ 1)t 
1—2p cos 
*1—p*—(1—cos #)+"*![cos mt—cos (n+1)#]—(1—p)p**! cos nt 
(1—p)?+2p(1—cos #) 


dt 
o (1 — p)* + 4p sin? (¢/2) 


sin? (¢/2)dé 
o (1 — p)? + 4p sin? (¢/2) 


sin v0 = (1 — 


o (1 — p)? + 49 sin? (¢/2) 
sin (¢/2) sin + 1/2)édt 
+ 4p sin? (1/2) 


hence 
in f dt, 
0 
thus 




sin? (¢/2)dt On 
0 (1 mY Pn)? + 4p, sin? (¢/2) 4pn 


— 0 


Next 


dt 


n du 
n?[(on —1)? + 4p, sin? (u/2n) | 


nd, 


dy 2 are tan (6/2) 
= 2 arc ta 
Similarly 


f cos ntdt 
(pn — 1)pn o (1 — pn)? + 4p, sin® (¢/2) 


cos udu ® cos udu 
J, n*[(on — 1)? + 4p, sin? (u/2n)] ve f 

and 

ati sin (¢/2) sin (m + 1/2)édt 
J (1 — pn)? + 4p, sin? (¢/2) 
nti (2m + 1) sin {u/(2n + 1)}-sin udu 
n(2n + 1) [(on—1)* + sin? {u/(2n+1)} ] 
® uw sin udu 


(1/2)e" f 


Summarizing 
ye’ cos u + e%u sin u — 2y 
> par sin 
1 0 
re e7(¢ sin yt + cos yt) — 2 a 
0 1+? 


The case (2.3) is now completely discussed. We next assume 


du 


lim SUP p, = 0, 


n> @ 
so that for a subsequence n=n':p"—+ 0. We first prove that (2.1) implies 
n'§,,.—0. Otherwise for a subsequence n”’ of n’:n’’0,,-—-8>0. For these in- 
dices 


Now 
ut 




"| sin ,| > pv | sin »,|, 
»Sa/6, 
where is so chosen that 0<a<§ and Now 


2p Pn 


| sin v,| > (2/2). pn = 


rSa/6,, 
hence (2.1) implies 
an/2B 


which by virtue of log pn/(pn—1)—>1 yields 6, =0(1). Furthermore 
Pn < | sin v0, | < On >, Pas 
1 1 


thus, if for a subsequence of indices pj, then for these indices (2. 1) is 
equivalent to 


If this condition is satisfied, then in view of ”0,—0 


1 


n 


sin 76, 
"<0, 0, 
(1- 
hence (2.2) holds if and only if lim 0,%/(p,—1) exists, which is then the value 


of 

Finally assume that lim inf p; =0; thus for a subsequence of indices p20 
which is (2.1). If on the other hand for a subsequence @,— ©, then 


"| sin | > 


2 


» -1 1 — cos 276, 


but | as v 7 , hence for 


1 =) 
> cos < ——— max cos 2v6, 
1+ ] Asn | 


6, 
< < 2/2. 
sin 6, 


Furthermore 




= > vy exp (v log > =f u exp (— uw log p, )du 


n+l (n-+1) log pz! 


exp (— log pa )du = f tle-*dt; 


1+ (1+ log 


thus in this case (2.1) implies 6, =O(log 1/p,), or 0,=O(1—p,). If this con- 
dition is satisfied, then 


| sin 0, | = = 0(6,/(1 — pa)) = O(1), 
1 1 
hence (2.1) holds. To satisfy (2.2) now, we note that 


sin v0, = = O(pn) = 0(1), 
n+l 1 — pp 

hence (2.2) holds if and only if 

Pn Sin 6, 


= 
> pa sin 0, = arc tan 
1 1 —= Da cos 6, 


has a limit, and @ is then this limit. But 


Pn Sin 0, Sin 6, On 1 6, 
1— pn pn + pa(l — cos 1 — pr 1 + O(1 — pn) 1 — pn 


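hence $\sigma$ exists, if and only if $\lim\theta_n/(1-\rho_n)=\delta\le+\infty$ exists. We then have
$\sigma=\lim\arctan\{\theta_n/(1-\rho_n)\}=\arctan\delta$. To summarize our results put

(a) $$\sigma(\beta, \gamma) = \int_0^\beta\frac{\gamma e^\gamma\cos u + e^\gamma u\sin u - \gamma}{\gamma^2+u^2}\,du, \quad \text{for finite } \gamma\neq 0,$$

$$\sigma(\beta, 0) = \int_0^\beta\frac{\sin u}{u}\,du,$$

(b) $$\sigma(0, \infty) = \lim\frac{\theta_n\rho_n^n}{\rho_n-1},$$

(c) $$\sigma(\delta, -\infty) = \lim\arctan\frac{\theta_n}{1-\rho_n} = \arctan\delta \le \pi/2.$$

We then have


THEOREM 1. Necessary and sufficient conditions that for every convergent
sequence $nb_n\to\tau$ the transform $\sum_1^n\rho_n^\nu b_\nu\sin\nu\theta_n$ has a limit, are that one of the
following three cases holds:

(a') $n(\rho_n-1)\to\gamma$ finite, $n\theta_n\to\beta<\infty$,
(b') $n(\rho_n-1)\to+\infty$, $\lim\theta_n\rho_n^n/(\rho_n-1)$ exists,
(c') $n(\rho_n-1)\to-\infty$, $\lim\theta_n(1-\rho_n)^{-1}=\delta$ exists, $0\le\delta\le\infty$.

The limit of the transform is then $\tau\sigma$, where $\sigma$ is defined above for the
respective cases. Different subsequences may belong to different cases $(\beta, \gamma)$ if
only the corresponding $\sigma$ attain the same value, and with the restriction
$n\theta_n=O(1)$ in case (a').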
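Case (a') can be illustrated numerically for the convergent sequence $nb_n=1$. The sketch below is not part of the original argument: it takes $\rho_n=1+\gamma/n$, $\theta_n=\beta/n$, and assumes that the integrand of (a) reads $(\gamma e^\gamma\cos u+e^\gamma u\sin u-\gamma)/(\gamma^2+u^2)$, which reduces to $\sin u/u$ for $\gamma=0$; the integral is evaluated by a simple midpoint rule, and the function names are hypothetical.

```python
import math

def transform(n, beta, gamma):
    """sum_{v<=n} rho_n^v v^-1 sin(v theta_n), with rho_n = 1 + gamma/n, theta_n = beta/n."""
    rho, theta = 1 + gamma / n, beta / n
    return sum(rho**v * math.sin(v * theta) / v for v in range(1, n + 1))

def sigma(beta, gamma, steps=20000):
    """Midpoint-rule value of the integral (a); assumed integrand, see the lead-in."""
    h = beta / steps
    e = math.exp(gamma)
    total = 0.0
    for j in range(steps):
        u = (j + 0.5) * h
        total += (gamma * e * math.cos(u) + e * u * math.sin(u) - gamma) / (gamma**2 + u**2)
    return total * h

if __name__ == "__main__":
    for beta, gamma in ((math.pi, 0.0), (2.0, 1.5), (2.0, -1.0)):
        print(beta, gamma, transform(20000, beta, gamma), sigma(beta, gamma))
```

For $\gamma=0$ and $\beta=\pi$ both columns approach the value of the integral $\int_0^\pi u^{-1}\sin u\,du$, and for the other pairs the transform and the integral agree to about three decimals at n=20000.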

3. Permanency with respect to (C, x) summability. Given the sequence 


{r,}, write 

0 = 

Tn = Tns = Tn 3 = 1,2,3,-++; 
yen] 


also 
(e+ 1)-*+ 


n! 


(3.1) An = Catan = 


The sequence { Ta} is summable (C, x) to the value 7, if 7§/AS—r as n> &; 
(C, 0) is evidently convergence. 
We write 


A°r, = Tny = Ata A*r, = A(A*"1,); 


then by induction 
(3.2) = (— «=0,1,2,---, 


Abel’s transformation yields for finite sums 


> a7, = > = > = 
1 1 1 
where @n41=0, On42=0, - - -. Applying this to (1.5) we get 
T, = On», 


where a,,=0 for »y>n. Thus the transform converges for every (C, x) sum- 
mable sequence if in addition to the conditions of §2 


| = O(1) asn—> ©, 


In particular for the transform (1.6) we have the conditions (2.1), (2.2) and 


(3.3) Fa sin + > A;| &,| = O(1) asn—>o, 


n* 
x! 
| 




where, from (3.2) 
= » , sin (A + v)6, 
& = (— 1) 
v= 0 r + v 


We first consider (C, 1) summability (k=1). Now (3.3) becomes 
> (» + 1)| sin 6, | + (n + 1)p, sin 0,| = O(1), 
1 
or 


n—1 
(3.4) pr sin — pa (» +1) sin (» + 1)0,| + sin = O(1). 
1 


We consider in succession the different cases of Theorem 1. 
(a’) For a sequence of indices ”8,—-8< ©, n(p,—1)—y finite, that is, 
>0. Thus pf sin is O(1), and 
n—1 
> vp» | sin v0, — pa(v + 1) sin (vy + | 
1 


n—1 


< ¥ vp.|» sin — + 1) sin (» + 1)6,| 
1 


+ |1—p,| > sin (v + 1)6, |. 
1 


Now 
n—1 n 
| 1 — sin (> + <|1— pn] pn < pn| 1 — = O(1), 
1 1 
and 
v| sin v0, — (v + 1)—' sin (v + | 
= | (» + sin (v + 1)0, — 2 sin (1/2)0, cos ((2» + 1)/2)6,| < 26,3 


hence 


sin — (» + sin (» + | < 20,55 = O(n0,) = O(1). 


1 


Hence in this case no additional condition results. 

(b’) ©, Hence 8,—0, and now sin 76, 
=0O(1) is equivalent to 0,p,=O(1). Thus 
—0, that is, c=0. Now 


n—1 n 
[1 — sin (> + 1)4,| = o| 1)0, 3 
1 1 


= — 1) ] = o(1); 




furthermore 


On >> Pn = — = o(1); 
1 
hence (3.4) holds. Finally: 
(c’) If lim 6,/(1—p,) < © exists, and n(p,—1)—>— ~, that is, pr-0, then 
sin 16,0, sin (v+1)0,| <(1—pn) p2=1, and 


= O(1). 


1 — Pn 
No additional condition appears in this case. Summarizing, we have 


THEOREM 2. Necessary and sufficient conditions that when $\lim n^{-1}\sum_1^n\nu b_\nu=\tau$
exists the transform $\sum_1^n\rho_n^\nu b_\nu\sin\nu\theta_n$ has a limit $\tau\sigma$, are either of the alternatives:

(a'') $n(\rho_n-1)\to\gamma$, finite, $n\theta_n\to\beta<\infty$,

(b'') $n(\rho_n-1)\to+\infty$, $n\theta_n\rho_n^n=O(1)$,

(c'') $n(\rho_n-1)\to-\infty$, $\lim\theta_n(1-\rho_n)^{-1}$ exists.


The value of $\sigma$ is in the cases (a'') and (c'') given by (a) and (c). In case
(b'') $\sigma=0$. Different subsequences may belong to different cases if only $\sigma$ has
the same value, with the restriction $n\theta_n=O(1)$ in case (a'').

We now consider (C, x) summability for x>1. First of all, to satisfy (3.3) 
we must have 

n—m+» Sin (2 — m+ 


(3.5) p> (— 


= O(1), 


Or 
sin 70, = O(1), 
en (nm — sin 76, 


N Pn — KPn O(1), 
n—1 n 


Pa — 


This is equivalent to 
sin 10, = O(1), 


np. sin (n — = O(1), 


np, sin (n — x + 1), = O(1). 


cn i = 1 On «— x i nb, 
(n — «+ 1) +(-1) 1 sin = O(1). 
nN | 
| 
(3.6) 
| 
| 




In case (a’’) the first condition becomes sin 20, =O(n'-*), as n— © ; in par- 
ticular sin 20,0, thus in view of (a’’) 26,—dz, Xd a positive integer or zero. 
On putting 20, =Ar+e,, we get sin €, =O(n'—*), or =O(n'—*). From 
the second condition now cos 78, sin 0, =O(n'-*), asn—> ©, or Aw+€, =O(n?-*) ; 
hence for x=2, (3.6) reduces to 
(3.7) nb, = + O(n"). 
For x>2 we must have 
nb, — = = O(n") and Ar+e, = O(n), 

hence \=0, and 
(3.7’) = O(n'*). 
It then follows that 

n*-' sin (n — v)0, = O(1) fory = 0,1,---,«-—1. 


Furthermore, for the rest of (3.3) 


A,| sin | = o( sin | ). 
1 


1 


8 
A‘p’v— sin v0 = A*p” f cos vidi = R f A*‘z’dt, pe*, 
0 


and, using (3.2) 


6 
A‘p’v-! sin = R Dd (= = Rf — z)*dt, 
0 


A=0 


hence 
6 6 
0 0 


< 6p"{ (1 — p)? + «/2, 
Thus 


sin 0, | < ( > - pn) + pr 
1 1 


= 2), 


and, from p,=O(1), 


Now 

(3.8) 

1 




6, O(nd,) = O(1). 


Hence in case (a’’) the additional condition is (3.7) for x=2, and (3.7’) for 
k>2. 

In case (b’’): n(p,—1)—>+ ©, as hence 
and 26,—0. Now (3.6) becomes 


(3.9) = O(1). 


For large n evidently p,>1, and 
On >> Pn < = O(1) 
1 


(from (3.9)). In view of (3.8) now (3.3) holds. Thus in this case the additional 
condition is (3.9) (for x22). 

Finally, in case (c’’): 2(p2—1)—>— © (that is p20), and lim 0,/(1—pn) 
=5< exists. Now pp<1/(n(1—p,)), hence 6,p%=O(1); thus for =2 con- 
dition (3.6) reduces to mp} sin 8,=O(1). While for x>2 (3.6) reduces to 
sin n0,=O(1) and n*-#,p,=O(1). Furthermore, as p,<1, 
<6,/(1—p,) =O(1), hence, in view of (3.8) now (3.3) is satisfied. 

We summarize our results in 


THEOREM 3. In order that lim >-%p2b, sin v0,=7o0 exists, whenever (C, «) 
lim nb, =r for some x22, necessary and sufficient conditions are the alternatives: 

(a’’’) n(pn—1)—y, finite, and for x=2: nO, =d\x+O(n-"), d an integer, for 
k>2:0,=O(n-*); 

(b’’’) n(p,—1) ©, =O(1); 

(c’’’) n(pa—1) — ©, exists, and for x=2: 
sin nb,=O(1), for n*p8(0,+|sin n0,|)=O(1). 


The value of o@ is given in case (a’’’) by (a), where for x=2: B=Xz, for 
x>2:8=0, in case (b’’’): ¢=0; in case (c’’’): o=arc tan 6. 

4. Application to Fourier series. First consider a function of bounded
variation and its Fourier sine series (1.1). It follows from the introduction
that $\lim nb_n$, if it exists, is $(2/\pi)f(+0)$. Under the assumptions of Theorem 1
on $\rho_n$ and $\theta_n$, $s_n(\rho_n, \theta_n)\to(2\sigma/\pi)f(+0)$. In particular whenever $\sigma>\pi/2$,
then we have an analogue of Gibbs' phenomenon. It is known that for
functions of bounded variation

$$n^{-1}\sum_1^n \nu b_\nu \to (2/\pi)f(+0);$$

more generally if (cf. Szász [3, Lemma 6])




6 
= (2/0) flt)dt j, 


n+« 
lim liminf min >> 0, 


élo 


(1/n) 


Hence, applying Theorem 2 we have 
On) ¥), as m(p, — 1)—> y and 6, — B; 


j is the generalized jump of f(@) at @=0. For y=0 this yields a generalization 
of formula (1.4). Note that 
1 — cos v0 sin (v@/2) \? 
dt = b, ———— = (0/2 
(20/2) { so/ v0)2s, } is called the Riemannian mean of the second 
kind corresponding to the sequence {sn}. It is a regular transform, as is seen 


from the identity 
20 ; 2 
{1/2 + >(="*) = 1, 
1 v0 


If we assume only that (C, 2) $\lim n b_n = j/\pi$ exists, then Theorem 3 yields
again a Gibbs' phenomenon in the case (a''') and $\lambda > 0$.
In this connection we introduce two lemmas.


Lemma 1. If 


(4.3) (1 —>T 


and 
(4.4) Tr = > pn, 
1 


for some p>0, and all n>0, then 
(4.5) (C, 2) lim r, = rf. 
We have from (4.3) 


and 

(4.2) 

then 
TR 
1 




(i- 
1 


p> (rd + t p 


in view of (4.4) a theorem of Hardy and Littlewood yields 


~ (1/2)rn?, 


1 


which is (4.5). 
Lemma 2. If (4.1) holds, then $(1 - r)\sum_{1}^{\infty} n b_n r^n \to j/\pi$ [3, Lemma 5].


Combining these two lemmas it is seen that (4.1) and the assumption

(4.6) $\sum_{\nu=1}^{n} \nu b_\nu > -pn$ for some $p > 0$ and all $n > 0$,

imply (C, 2) $\lim n b_n = j/\pi$. With reference to Theorem 3 the assumptions
(4.1) and (4.6) again yield a Gibbs' phenomenon.

In closing we remark that the existence of $f(+0)$ implies itself (C, 2) $\lim n b_n = (2/\pi) f(+0)$. A more general result will be given elsewhere.


REFERENCES 


1. W. Rogosinski, Ueber den Einfluss einseitiger Eigenschaften einer Funktion auf ihre
Fourierreihe, Schriften der Koenigsberger Gelehrten Gesellschaft, Naturwissenschaftliche Klasse,
vol. 3 (1926) pp. 57-98.

2. -----, Abschnittsverhalten bei trigonometrischen und insbesondere Fourierschen Reihen,
Math. Zeit. vol. 41 (1936) pp. 75-136.

3. O. Szász, Convergence properties of Fourier series, Trans. Amer. Math. Soc. vol. 37 (1935)
pp. 483-500.

4. -----, The jump of a function determined by its Fourier coefficients, Duke Math. J.
vol. 4 (1938) pp. 401-407.

5. A. Zygmund, Trigonometrical series, 1935.


UNIVERSITY OF CINCINNATI, 
OHIO 


hence 
or 

asn—> ©, 


THE CHARACTERISTIC OF A QUADRATIC FORM FOR 
AN ARBITRARY FIELD 


BY 
RUFUS OLDENBURGER 


1. Introduction. Ernst Witt [1](1) has shown that for a field K with characteristic
not 2 each quadratic form Q can be transformed into a form G+H
where

$$G = \sum_{i=1}^{\sigma} x_i y_i,$$

H is a nonzero form, that is, does not represent zero properly, the rank of Q
is the sum of the ranks of G and H, and G has rank $2\sigma$. The number $\sigma$ is an
invariant of Q under nonsingular linear transformations on Q. For the real
field the number $\sigma$ was defined by Loewy [2] as the minimum of the indices(2)
of Q and $-Q$, and termed the characteristic of Q. Loewy showed that this
characteristic could be defined in terms of exponents of elementary divisors
of pencils $\{\rho F - Q\}$ formed from Q and real quadratic forms $\{F\}$. The definition
of Loewy does not extend to an arbitrary field K, whereas the characteristic
of Q for K is arrived at by Witt through the examination of a sequence
of quadratic forms $Q, H_1, \ldots, H_\sigma$, where for each s the form Q is $G_s + H_s$ for


$$G_s = \sum_{i=1}^{s} (L_i^2 - M_i^2),$$

the L's and M's being linearly independent linear forms, while the rank of Q
is the sum of the ranks of $G_s$ and $H_s$. In the present paper we shall show that
the characteristic of Q can be defined in terms of linearly independent linear
forms directly associated with Q. This definition is particularly convenient
for treating sums of forms.

With the aid of the viewpoint developed here it is proved (§§3-4) that
the characteristic of a quadratic form Q is related to the ranks of the principal
minors of matrices associated with Q. By means of this relation we are
able to treat the characteristics of sums of forms, and to show in particular
(§5) that the characteristic of $Q + \lambda L^2$, L linear, differs at most by 1 from
that of Q. This property is likewise possessed by the rank of Q, and, if K is real,
also by the index of Q. The above result on the characteristic of $Q + \lambda L^2$ will
be used in another paper [3] to show that the characteristic $\sigma$ of a quadratic


Presented to the Society, April 3, 1942; received by the editors February 13 and Septem- 
ber 2, 1942. 

(1) The numbers in brackets refer to the bibliography at the end of the paper.

(2) The index of Q is the number k of + signs in a canonical form $x_1^2 + \cdots + x_k^2 - x_{k+1}^2 - \cdots - x_r^2$
to which Q is equivalent under a nonsingular linear transformation.


form Q determines the minimum value $\tau$ for which Q can be written as

$$Q = \sum_{i=1}^{\tau} L_i M_i,$$

where the L's and M's are linear forms. The maximum value which can be
attained by the characteristic of Q relative to the rank r of Q is [r/2], where
[r/2] designates the largest integer not exceeding r/2.
A sum

$$Q = \sum_{i=1}^{r} a_i L_i^2,$$

where the L's are linear, and r is the rank of Q, is a minimal representation of Q.

2. Preliminary definitions and conventions. Throughout the present paper
the usual restriction that the characteristic of the field K be different from 2
is made in order that each quadratic form Q may be written as

(2.1) $Q = \sum_{i,j=1}^{n} a_{ij} x_i x_j$

where the matrix $(a_{ij})$ of coefficients is symmetric. We term $(a_{ij})$ the matrix
A of Q.

In what follows we shall use the term “equivalent” to mean equivalent 
under nonsingular linear transformations. 

We define the characteristic of a quadratic form Q to be the maximum
number $\sigma$ of linearly independent linear forms $L_1, \ldots, L_\sigma$ such that the
rank of

(2.2) $Q + \sum_{i=1}^{\sigma} \lambda_i L_i^2$

is identical with the rank of Q for all values of the $\lambda$'s. In what follows the
term “characteristic” will be understood to refer to the invariant just defined,
until we have proved (§3) that this invariant is identical with the characteristic
of Witt, and for the real field with that of Loewy.
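
As a hedged illustration of this definition (the example forms and the choice of linear forms are assumptions made here, not taken from the paper), consider the binary form $Q = x_1 x_2$ over a field of characteristic different from 2:

```latex
% Worked example (illustrative only; Q = x_1 x_2 is an assumed example form).
\begin{align*}
Q + \lambda_1 x_1^2 &= x_1 (x_2 + \lambda_1 x_1),
  &&\text{rank } 2 \text{ for every } \lambda_1, \\
Q + \lambda_1 x_1^2 + \lambda_2 x_2^2 &\;\longleftrightarrow\;
  \begin{pmatrix} \lambda_1 & 1/2 \\ 1/2 & \lambda_2 \end{pmatrix},
  &&\det = \lambda_1 \lambda_2 - \tfrac14 .
\end{align*}
% With L_1 = x_1 the rank stays 2 for all \lambda_1, so the characteristic is at least 1.
% With L_1 = x_1, L_2 = x_2 the determinant vanishes when \lambda_1 \lambda_2 = 1/4,
% so this particular pair of forms does not preserve the rank.
```

By contrast, for $Q = x_1^2$ every nonzero $L = a x_1$ gives $Q + \lambda L^2 = (1 + \lambda a^2) x_1^2$, whose rank drops to 0 at $\lambda = -1/a^2$, so the characteristic of that form is 0.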

If a form Q with rank r is written as a sum G+H where G has rank $2\sigma$
and characteristic $\sigma$, and H has rank $r - 2\sigma$ and characteristic 0, we term
G+H a characteristic splitting of Q. If we write Q as

(2.3) $\sum_{i=1}^{\sigma} L_i M_i + H,$

where the component in the L's and M's is identical with the form G in a
characteristic splitting of Q, we have a decomposition of Q.

If the rank of the form (2.2) is identical with the rank r of Q for all values
of the $\lambda$'s, the forms $L_1, \ldots, L_\sigma$ clearly depend only on the variables which
occur in Q. A stronger statement can be made. We let

$$Q = a_1 M_1^2 + \cdots + a_r M_r^2$$

be a minimal representation of Q. Since $M_1, \ldots, M_r$ may be taken as the
variables in terms of which Q is expressed, it follows that the L's are linear
forms in the M's, whence the characteristic of Q cannot exceed r.

3. Characteristic splittings. In the present section we shall relate the char- 
acteristic of a quadratic form Q to the results of Witt. 

We recall that a matrix D of order c and rank d has nullity c—d. 


LEMMA 3.1. The characteristic of a quadratic form Q of rank r is the maximum
$\sigma$ for which Q is equivalent to a form F with rth order matrix C where complementary
principal minors $C_{11}$, $C_{22}$ of C have order and nullity $\sigma$, respectively.


In considering matrices $\{C\}$ with complementary principal minors $C_{11}$
and $C_{22}$, it will be no restriction if we take $C_{11}$ to be a leading minor so that

(3.1) $C = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix}.$
We write Q as in (2.1) with n = r. We suppose that Q has characteristic $\sigma$.
We may assume without restriction that the rank of (2.2) with $L_i = x_i$ for
each i is identically equal to r for all values of the $\lambda$'s. We assume that $\sigma \ge 1$.
We let M denote the minor of the matrix A of Q obtained from A by deleting
the first $\sigma$ rows and $\sigma$ columns of A. Expanding the determinant of the form
(2.2) we find that the rank of the form (2.2) is r for all values of the $\lambda$'s if
and only if each principal minor determinant of A containing M vanishes,
except the determinant |A| itself. The rank of M is a number b, where
$b < r - \sigma$. We may suppose that M is in the shape

(3.2) $\begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix}$

where D is a nonsingular minor of order b. We suppose, for the moment, that
$b \ge r - 2\sigma$, and consider the minor determinants of A of the type


« £ 


(3.3) 
E00 


where E has order $r - \sigma - b$, and E' denotes the transpose of E. If $b > r - 2\sigma$,
the minors of type (3.3) are distinct from |A|. Since the vanishing of (3.3)
implies the singularity of E, the last $r - \sigma - b$ columns of A are linearly
dependent, a contradiction. Thus $b \le r - 2\sigma$. If $b < r - 2\sigma$, the matrix A is
singular. Thus $b = r - 2\sigma$.

Conversely, if Q has the matrix C where $C_{22}$ has rank $r - 2\sigma$ and order $r - \sigma$,
we adjoin elements from g rows and g columns of C to $C_{22}$ to obtain a minor
of C of order m, where $m = r - \sigma + g$, with rank at most r', where $r' = r - 2\sigma + 2g$.
If $\sigma > 0$, the relation $g < \sigma$ implies that $r' < m$, whence each square minor of C
containing $C_{22}$, except C, is singular. The rank of (2.2) with $L_i = x_i$ for each i
is now r for all choices of the $\lambda$'s, whence the characteristic of Q is at least $\sigma$.


THEOREM 3.1. A quadratic form Q with rank r has characteristic $\sigma$ if and only
if Q has the characteristic splitting G+H, where G has rank $2\sigma$ and characteristic
$\sigma$, while H has rank $r - 2\sigma$ and characteristic 0.


By a result of Witt, quoted in the introduction, the form Q is equivalent
to a sum

(3.4) $\sum_{i=1}^{\rho} x_i y_i + H,$

where H is a nonzero form with rank $r - 2\rho$. The number $\rho$ (by the theory of
Witt) is uniquely determined by Q. Further, if Q is equivalent to a form (3.4)
where H is a zero form with rank $r - 2\rho$, the form Q is equivalent to a sum
(3.4) with $\rho$ replaced by a larger number $\rho'$, where H now is a nonzero form
with rank $r - 2\rho'$.

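We let $\sigma$ denote the characteristic of Q. Since the rank of

$$\sum_{i=1}^{\rho} x_i y_i + H + \sum_{i=1}^{\rho} \lambda_i x_i^2$$

is r for all values of the $\lambda$'s we have $\sigma \ge \rho$.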

By Lemma 3.1 the form Q is equivalent to a form Q' with the matrix C,
given in (3.1), where the order of $C_{11}$ and the nullity of $C_{22}$ equal $\sigma$. In view
of the nullity of $C_{22}$ we may assume that $C_{22}$ is written as (3.2) where the order
of D is equal to $r - 2\sigma$. We write Q' as (2.1) with n = r, whence $(a_{ij}) = C$. The
form Q' is now the sum G'+H', where

(3.5) $G' = \sum_{i=1}^{\sigma} x_i L_i,$

the L's are linear forms and H' is the form with matrix D. Since Q' has
rank r, the variables $x_1, \ldots, x_\sigma$ in G', as well as $x_{\sigma+1}, \ldots, x_{r-\sigma}$ in H', and
$L_1, \ldots, L_\sigma$ comprise a set of linearly independent forms, whence these may
be taken as the variables in terms of which Q' is expressed. It follows from the
Witt theory that $\rho \ge \sigma$, whence $\rho = \sigma$.

It is readily seen that the component in the x's and L's in (3.5) has characteristic
$\sigma$, whereas H' has characteristic 0.

From Theorem 3.1 there are a number of immediate consequences valid
for an arbitrary field K. The characteristic of a quadratic form Q of rank r
does not exceed [r/2]. The characteristic of a quadratic form Q of rank r
attains the maximum value r/2 if and only if Q is equivalent to the canonical
form

$$\sum_{i=1}^{r/2} x_i y_i.$$

A quadratic form Q is a nonzero form if and only if the characteristic of Q is 0.

For the complex field the characteristic of a quadratic form Q is [r/2].
For the real field the characteristic is clearly the minimum of the indices of
Q and $-Q$. The index and characteristic of Q may thus be distinct. In fact
the concept of index of Q is equivalent to that of characteristic together with the type
(±) of definiteness of a nonzero component H in a canonical splitting of Q.
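
For the real field, the remark above can be checked on small examples. The following is a minimal sketch, not part of the paper: it diagonalizes a symmetric matrix with rational entries by simultaneous row and column operations and then reads off the rank, the index, and the characteristic as the minimum of the indices of Q and $-Q$. The helper names `diagonalize_by_congruence` and `rank_index_characteristic` are hypothetical, and rational input is an assumption made so that the arithmetic is exact.

```python
from fractions import Fraction


def diagonalize_by_congruence(matrix):
    """Return the diagonal of a diagonal matrix congruent to `matrix`.

    `matrix` is a symmetric square matrix with rational entries. Only
    simultaneous row/column operations are used, so the result represents
    an equivalent quadratic form.
    """
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    diag = []
    for k in range(n):
        # Look for a nonzero diagonal entry among the remaining variables.
        p = next((i for i in range(k, n) if a[i][i] != 0), None)
        if p is None:
            # All remaining diagonal entries vanish: use an off-diagonal one.
            pair = next(((i, j) for i in range(k, n)
                         for j in range(i + 1, n) if a[i][j] != 0), None)
            if pair is None:
                diag.extend([Fraction(0)] * (n - k))  # the rest is zero
                break
            i, j = pair
            # Adding row/column j to row/column i makes a[i][i] = 2*a[i][j],
            # which is nonzero because the characteristic is not 2.
            for col in range(n):
                a[i][col] += a[j][col]
            for row in range(n):
                a[row][i] += a[row][j]
            p = i
        # Move the pivot to position k by a symmetric swap.
        a[k], a[p] = a[p], a[k]
        for row in range(n):
            a[row][k], a[row][p] = a[row][p], a[row][k]
        pivot = a[k][k]
        diag.append(pivot)
        # Clear row k and column k outside the diagonal.
        for i in range(k + 1, n):
            factor = a[i][k] / pivot
            if factor != 0:
                for col in range(n):
                    a[i][col] -= factor * a[k][col]
                for row in range(n):
                    a[row][i] -= factor * a[row][k]
    return diag


def rank_index_characteristic(matrix):
    """Rank, index, and characteristic of a real form with rational matrix."""
    diag = diagonalize_by_congruence(matrix)
    positive = sum(1 for d in diag if d > 0)   # index of Q
    negative = sum(1 for d in diag if d < 0)   # index of -Q
    return positive + negative, positive, min(positive, negative)
```

For instance, the matrix `[[0, 1/2], [1/2, 0]]` of the form $x_1 x_2$ has rank 2, index 1 and characteristic 1, while a definite form has characteristic 0.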

From a result of Dickson [4] one can readily show that the quadratic
forms

$$\sum_{i=1}^{n} a_i x_i^2, \qquad \sum_{i=1}^{n} e_i x_i^2,$$

with $a_1 = e_1$ and rank n, are equivalent if and only if the subforms

$$\sum_{i=2}^{n} a_i x_i^2, \qquad \sum_{i=2}^{n} e_i x_i^2$$

are equivalent. Witt [1] proved this same result by different methods, and
showed that quadratic forms with canonical splittings G+H and G'+H',
where H and H' are nonzero forms, are equivalent if and only if they have
the same characteristic, and H is equivalent to H'. Thus H is uniquely determined
up to equivalence. The study of the equivalence of quadratic forms
thus reduces to that of nonzero forms, so extensively treated in the literature(3).

4. Characteristics and principal minors. To treat characteristics of sums 
of forms we shall need some properties of principal minors developed here. 


LEMMA 4.1. If the matrix of order r of a quadratic form Q of rank r has a
principal minor of nullity $\sigma$, the characteristic of Q is at least $\sigma$.


Without restriction on the generality of the method we may suppose that
the matrix of order r of $Q(x_1, \ldots, x_r)$ is given by

(4.1) $A = \begin{pmatrix} D_{11} & D_{12} & D_{13} \\ D_{21} & D_{22} & 0 \\ D_{31} & 0 & 0 \end{pmatrix}$

where the bottom right zero represents a minor of order $\sigma$, and $D_{22}$ is nonsingular.
We let t designate the number of rows of $D_{13}$. By Lemma 3.1 we
may restrict ourselves to the case where $t > \sigma$. By a nonsingular linear
transformation affecting only the variables $x_1, \ldots, x_t$, the form Q can be brought
into a form Q' with matrix B, where the principal minor of B obtained by


(3) See, for example, [5].




striking out the first t rows and columns of B is identical with this minor
for A, and $D_{13}$ is replaced by a nonsingular $\sigma$th order minor followed by rows
of zeros. The matrix B is of the type C given in (3.1), where $C_{11}$, $C_{22}$ have
order and nullity $\sigma$, respectively, whence by Lemma 3.1 the characteristic
of Q is at least $\sigma$.

In Lemma 3.1 we showed how the characteristic $\sigma$ of a quadratic form Q
is the maximum value $\sigma$ for which certain minors $C_{11}$, $C_{22}$ possess given properties.
We shall show how a further examination of these minors reveals
whether or not the maximum value $\sigma$ is attained for them. In the following
theorem the characteristic of $C_{22}$ is understood to be the characteristic of the
quadratic form associated with $C_{22}$.


THEOREM 4.1. Suppose that the order and nullity of complementary principal
minors $C_{11}$, $C_{22}$ of the matrix of order r of a quadratic form Q of rank r equal $\sigma$,
respectively. The characteristic of Q is $\sigma$ if and only if the characteristic of $C_{22}$ is 0.


We write Q as in (2.1) with n = r. Since $C_{22}$ has nullity $\sigma$ there is a nonsingular
matrix M such that $M C_{22} M'$ is the minor (3.2) where D has order
$r - 2\sigma$. It will thus be no restriction on the generality of the method to suppose
that the matrix A of Q has the shape (4.1) with $D_{22} = D$, and the order of $D_{11}$
equal to $\sigma$. We can thus write Q as a sum G'+H', where G' is given by (3.5),
H' is a form with the matrix D, and the rank of Q is the sum of the ranks of
G' and H'. The characteristic of G' is clearly $\sigma$.

If Q has characteristic $\sigma$, the sum G'+H' is a characteristic splitting of Q,
whence H' has characteristic 0. It follows that $C_{22}$ has characteristic 0.

If, conversely, the characteristic of $C_{22}$ is 0, the sum G'+H' is again a
canonical splitting, whence the characteristic of Q is $\sigma$.

We consider the canonical splitting G+H, where

$$G = \sum_{i=1}^{\sigma} x_i y_i$$

and H is a form in $x_{\sigma+1}, \ldots, x_{r-\sigma}$. Since the matrix of G+H is of the type
(3.1) with the order of $C_{11}$ and the nullity of $C_{22}$ equal to $\sigma$, and the characteristic
of $C_{22}$ equal to 0, the characteristic of a quadratic form Q is $\sigma$ if and only
if Q is equivalent to a form with the matrix (3.1) where complementary principal
minors $C_{11}$ and $C_{22}$ have the properties just mentioned.

5. The characteristic of a sum of forms. Each quadratic form Q is equivalent
to a quadratic form with a diagonal matrix. This is the same as the property
that each quadratic form Q has a minimal representation. It follows that
the study of the effect of the addition of a quadratic form F to a quadratic
form Q reduces to the study of the addition of a term $\lambda L^2$, L linear, to Q.


THEOREM 5.1. Under addition of a term $\lambda L^2$, L linear, to a quadratic form Q
the characteristic $\sigma$ of Q changes at most by 1.




We suppose that the characteristic of $Q + \lambda L^2$ is at least $\sigma + 2$. We let q
designate the rank of $Q + \lambda L^2$. By Lemma 3.1 the pair (Q, L) can be transformed
nonsingularly into a pair (Q', M), where the matrix C of $Q' + \lambda M^2$ is
of order q, and is of the shape (3.1), while $C_{11}$ and $C_{22}$ have order and nullity
$\sigma + 2$, respectively.

The rank of a quadratic form Q changes at most by 1 under addition of a
term $\lambda L^2$, L linear, to Q. We shall assume, to begin with, that q = r, where r
is the rank of Q. The addition of $-\lambda M^2$ to $Q' + \lambda M^2$ changes the nullity of
$C_{22}$ by at most 1. Thus $C_{22}$ goes into a minor $D_{22}$ with nullity at least $\sigma + 1$.
By Lemma 4.1 the characteristic of Q' is at least $\sigma + 1$, a contradiction. It
follows that the characteristic of $Q + \lambda L^2$ does not exceed $\sigma + 1$. Thus


We now consider the case where q = r+1, and designate the variables in
$Q' + \lambda M^2$ by $y_1, \ldots, y_{r+1}$, where the rank of

$$Q' + \lambda M^2 + \sum_{i=1}^{\sigma+2} \lambda_i y_i^2$$

is r+1 for all values of $\lambda_1, \ldots, \lambda_{\sigma+2}$. By the development used in the proof of
Lemma 3.1 the form $Q' + \lambda M^2$ has a matrix C of order r+1 as given in (3.1)
where $C_{11}$ and $C_{22}$ are minors of order and nullity $\sigma + 2$, respectively. If M is
linearly independent of $y_1, \ldots, y_{\sigma+2}$, we may suppose that $M = y_{\sigma+3}$. Removal
of the row and column of C corresponding to $y_{\sigma+3}$ yields the matrix of Q' of
order r. The minor obtained from $C_{22}$ by removal of this row and column has
nullity at least $\sigma + 1$. By Lemma 4.1 the characteristic of Q' is at least $\sigma + 1$, a
contradiction. If, on the other hand, the form M is linearly dependent on
$y_1, \ldots, y_{\sigma+2}$, the minor $C_{22}$ of C is a minor of a matrix C* of order r+1 of Q'.
Since $C_{22}$ has nullity $\sigma + 2$, there is a nonsingular matrix N such that N'C*N = A,
where A is given in (4.1), the minors $D_{11}$ and $D_{33}$ being of order $\sigma + 2$, while $D_{22}$
is nonsingular. Since A is singular, $D_{13}$ is singular. It follows that there is a
nonsingular matrix M such that M'AM is identical in shape with A except
that the last column of $D_{13}$ is replaced by a column of zeros, and a corresponding
remark holds for the last row of $D_{31}$. We drop the last row and column of
M'AM to obtain a matrix B whose lower principal minor of order $r - \sigma - 2$ has
nullity $\sigma + 1$. By Lemma 4.1 and the invariance of the characteristic under
nonsingular linear transformations we have again arrived at a contradiction.
Thus in any event, when q = r+1, the characteristic of $Q + \lambda L^2$ does not exceed
$\sigma + 1$.

Since the characteristic of Q is $\sigma$ there are linearly independent linear forms
$L_1, \ldots, L_\sigma$ such that the rank of (2.2) is r for all values of the $\lambda$'s. If q = r+1,
the form L is linearly independent of the variables in Q, whence the rank of

$$Q + \lambda L^2 + \lambda_1 L_1^2 + \cdots + \lambda_\sigma L_\sigma^2$$

is r+1 for all values of $\lambda_1, \ldots, \lambda_\sigma$. Thus if the rank of $Q + \lambda L^2$ exceeds the
rank of Q the characteristic of $Q + \lambda L^2$ is at least as great as that of Q.
The case where q = r-1 reverts to the preceding.


COROLLARY 5.1. If $\rho$ and R denote the characteristic and rank of the quadratic
forms Q and F, respectively, the characteristic $\sigma$ of Q+F satisfies the inequalities:

$$\rho - R \le \sigma \le \rho + R.$$


We shall show that Theorem 5.1 is valid if we replace the characteristic
of Q by the index of Q for the real field. In the following theorem all coefficients
are understood to be in the real field.


THEOREM 5.2 (Analogue of Theorem 5.1). Under addition of a term $\lambda L^2$,
L linear, to a real quadratic form Q, the index of Q changes at most by 1.


It is readily seen that Theorem 5.2 is true when the ranks of Q and
$Q + \lambda L^2$ are distinct. We suppose, therefore, that these ranks are identical.


We have 


where r is the rank of Q, while g is the index of $Q + \lambda L^2$, and the M's are linear
forms. We suppose that $g > h + 1$, where h is the index of Q. We may suppose


that Q is written as . 
2 8 

> >. Xe 

We set $x_1 = \cdots = x_h = 0$. Since $M_1, \ldots, M_g$ are linearly independent to begin
with, at most h of these vanish. We may assume, therefore, that $M_{h+1}$
and are linearly independent of - - - , 
= M,=0. We have 


r 
(5.1) > 
We have imposed at most $r - g + h + 1$ independent conditions on the variables
in Q. It follows that the left and right members in (5.1) do not vanish
identically, whence we have a contradiction.

The theorems above and well known rank theory now imply that the
rank, characteristic, and, if the field K is real, also the index of a quadratic
form Q, have the common property that they change at most by 1 under
addition of a term $\lambda L^2$, L linear, to Q.
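
A small worked example over the real field may make this common property concrete. The forms below are chosen here purely for illustration (they do not appear in the paper); the characteristic is computed as the minimum of the indices of the form and of its negative.

```latex
% Illustrative check over the real field (example forms are assumptions).
% Start from Q = x_1^2 - x_2^2 and add a single term \lambda L^2 each time.
\[
\begin{array}{l|c|c|c}
\text{form} & \text{rank} & \text{index} & \text{characteristic} \\ \hline
Q = x_1^2 - x_2^2                    & 2 & 1 & 1 \\
Q + x_2^2 = x_1^2                    & 1 & 1 & 0 \\
Q + x_3^2 = x_1^2 - x_2^2 + x_3^2    & 3 & 2 & 1 \\
Q - x_1^2 = -x_2^2                   & 1 & 0 & 0
\end{array}
\]
```

In each row after the first, the rank, the index and the characteristic differ from those of Q by at most 1.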




BIBLIOGRAPHY 


1. Ernst Witt, Theorie der quadratischen Formen in beliebigen Körpern, J. Reine Angew.
Math. vol. 176 (1937) pp. 31-44.

2. Alfred Loewy, Ueber Scharen reeller quadratischer und Hermitescher Formen, ibid. vol. 122
(1900) pp. 53-72.

3. Rufus Oldenburger, Expansions of quadratic forms, Bull. Amer. Math. Soc. vol. 49
(1942) pp. 136-141.

4. Leonard E. Dickson, On quadratic forms in a general field, ibid. vol. 14 (1907) pp. 108-115.

5. Helmut Hasse, Über die Darstellbarkeit von Zahlen durch quadratische Formen im Körper
der rationalen Zahlen, J. Reine Angew. Math. vol. 152 (1923) pp. 129-148.


ILLINOIS INSTITUTE OF TECHNOLOGY,
CHICAGO, ILL.


ON THE OSCILLATION OF DIFFERENTIAL TRANSFORMS. IV
JACOBI POLYNOMIALS(1)


BY
G. SZEGŐ


1. Introduction. In a paper in the Trans. Amer. Math. Soc.(2), E. Hille
proved the following


THEOREM A. Let $\alpha \ge 0$, $\beta \ge 0$, $c \ge 0$. The differential operation

(1.1) $\vartheta - c = (1 - x^2) D^2 + [\beta - \alpha - (\alpha + \beta + 2)x] D - c, \qquad D = d/dx,$

does not diminish the number of the sign changes in the interval $-1 \le x \le +1$.


More exactly, let y = y(x) be a real-valued non-constant function of x,
$-1 \le x \le +1$, with a continuous second derivative (with one-sided derivatives
at the end points $\pm 1$). Then the number of the sign changes of $Y = (\vartheta - c)y$
in $-1$, $+1$ is not less than that of y in the same interval(3).
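
The statement of Theorem A can be explored numerically on polynomials. The sketch below is only an illustration under stated assumptions: the operator is taken in the form $(1-x^2)D^2 + [\beta - \alpha - (\alpha+\beta+2)x]D - c$, polynomials are stored as coefficient lists (`coeffs[h]` multiplies $x^h$), and sign changes are counted on a finite grid, which can miss changes between grid points. The function names are hypothetical.

```python
from fractions import Fraction


def apply_operator(coeffs, alpha, beta, c):
    """Apply (1 - x^2) D^2 + [beta - alpha - (alpha+beta+2) x] D - c.

    coeffs[h] is the coefficient of x^h; the result uses the same layout.
    """
    out = [Fraction(0)] * len(coeffs)
    for h, a in enumerate(coeffs):
        a = Fraction(a)
        if h >= 2:
            out[h - 2] += h * (h - 1) * a            # from (1) * D^2
        if h >= 1:
            out[h - 1] += h * (beta - alpha) * a     # from (beta - alpha) * D
        # from -x^2 D^2, -(alpha+beta+2) x D and -c
        out[h] -= (h * (h - 1) + h * (alpha + beta + 2) + c) * a
    return out


def evaluate(coeffs, x):
    """Evaluate the polynomial at x by Horner's rule."""
    value = Fraction(0)
    for a in reversed(coeffs):
        value = value * x + a
    return value


def sign_changes(coeffs, samples=400):
    """Count sign changes on a uniform grid over [-1, 1]; zeros are skipped."""
    signs = []
    for i in range(samples + 1):
        v = evaluate(coeffs, Fraction(2 * i, samples) - 1)
        if v != 0:
            signs.append(v > 0)
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)
```

For example, with `y = [Fraction(-1, 4), 0, 1]` (that is, $x^2 - 1/4$) and $\alpha = \beta = 0$, $c = 1$, the transform has coefficients `[9/4, 0, -7]`, and both polynomials show two sign changes on the grid.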

First let us observe that under the conditions mentioned Y cannot vanish
identically, this being true even for $\alpha > -1$, $\beta > -1$. More precisely, the
solutions of the differential equation $(\vartheta - c)y = 0$ which are not identically
zero cannot have a continuous second derivative in the closed interval
$-1 \le x \le +1$, provided c > 0; in the case c = 0 the solution y = const. is the
only one of the kind mentioned(4). Indeed, let us assume that c > 0, and let
u(x) and v(x) be the solutions of the differential equation mentioned regular
at x = +1 and x = -1, respectively, and satisfying the condition u(+1)
= v(-1) = 1 [see (2.1)]. Then by means of the table in §2 below we conclude
that u(x) and v(x) are linearly independent [$u'(x) \to \infty$, $v'(x) = O(1)$ as
$x \to -1+0$ and $u'(x) = O(1)$, $v'(x) \to \infty$ as $x \to 1-0$]. Moreover $\{c_1 u(x) + c_2 v(x)\}' \to \infty$
either for $x \to -1+0$ or for $x \to +1-0$ (or in both cases) unless $c_1 = c_2 = 0$.

In the same paper E. Hille proved by means of Theorem A the special
case c = 0 of the following


THEOREM B. Let $\alpha \ge 0$, $\beta \ge 0$, $c \ge 0$ and let $\vartheta$ have the same meaning as in


Presented to the Society, February 27, 1943; received by the editors October 17, 1942. 

(1) See the previous papers of this series by G. Szegő, E. Hille and A. C. Schaeffer, in the
Trans. Amer. Math. Soc. vols. 52, 53 (1942-1943). (Cf. below, loc. cit. footnotes 2 and 6.)

(2) E. Hille, On the oscillation of differential transforms. II. Characteristic series of boundary
value problems, Trans. Amer. Math. Soc. vol. 52 (1942) pp. 463-497; see §2.8.

(3) Regarding the definition of the number of sign changes see G. Pólya and G. Szegő,
Aufgaben und Lehrsätze aus der Analysis, vol. 2, 1925, p. 40.

(4) G. Szegő, Orthogonal polynomials, Amer. Math. Soc. Colloquium Publications, vol. 23,
1939; see p. 61, (4.2.6).




Theorem A. We denote by f(x) a real-valued function possessing derivatives of all
orders in $-1 \le x \le +1$. If the number of the sign changes of the functions
$(\vartheta - c)^k f(x)$, k = 1, 2, 3, ..., is bounded, say at most N, then f(x) is a polynomial
of degree at most N.


The purpose of the present note is to prove 


THEOREM A'. Theorem A remains true under the more general condition
$\alpha > -1$, $\beta > -1$, $c \ge 0$.

THEOREM B'. Let $\alpha$ and $\beta$ be arbitrary real, $c \ge 0$. If f(x) satisfies the conditions
of Theorem B, f(x) must be a polynomial of degree at most $N + \gamma$. Here the
constant $\gamma = \gamma(\alpha, \beta, c)$ depends only on $\alpha$, $\beta$ and c.


Assuming $\alpha > -1$, $\beta > -1$, Theorem B' (with $\gamma = 0$) can be derived from
Theorem A' in a manner used first by G. Pólya and N. Wiener in case of
Fourier series(5) and applied later to numerous other instances by E. Hille
(loc. cit.). We prefer however a direct proof of Theorem B' based on an
idea which was used in the first paper of the present series(6).

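2. Proof of Theorem A'. First we assume c > 0. Let u(x) be the uniquely
determined solution of $(\vartheta - c)y = 0$ which is regular at x = +1 and for which
u(+1) = 1 holds; we have, as is well known,

(2.1) $u(x) = F(k, k'; l; (1-x)/2) = \sum_{\nu=0}^{\infty} \frac{k(k+1)\cdots(k+\nu-1)\, k'(k'+1)\cdots(k'+\nu-1)}{l(l+1)\cdots(l+\nu-1)\,\nu!} \left(\frac{1-x}{2}\right)^{\nu}$

where k and k' are the roots of the quadratic equation $k(-k + \alpha + \beta + 1) = c$
and $l = \alpha + 1$. Since $(k+\nu)(k'+\nu) = c + \nu(\nu + \alpha + \beta + 1) > 0$, $\nu = 0, 1, 2, \ldots$, we
have u(x) > 0 and u'(x) < 0 in $-1 < x \le +1$. Incidentally, k and k' are different
from 0, -1, -2, ...; l > 0.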
Let us investigate the behavior of u(x) and u'(x) as $x \to -1+0$. Since


_nP- 
Cesàro's theorem(7) can be applied to u(x) provided $\beta \ge 0$ and to u'(x) provided
$\beta > -1$. We obtain the following table:


(5) G. Pólya and N. Wiener, On the oscillation of the derivatives of a periodic function, Trans.
Amer. Math. Soc. vol. 52 (1942) pp. 249-256.

(6) G. Szegő, On the oscillation of differential transforms. I, Trans. Amer. Math. Soc. vol. 52
(1942) pp. 450-462.

(7) See, for instance, G. Pólya and G. Szegő, Aufgaben und Lehrsätze aus der Analysis,
vol. 1, 1925, p. 14, Problem 85.




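(2.3)
                    $u(x) \sim$            $-u'(x) \sim$
   $\beta > 0$          $(1+x)^{-\beta}$       $(1+x)^{-\beta-1}$
   $\beta = 0$          $-\log (1+x)$          $(1+x)^{-1}$
   $-1 < \beta < 0$     $1$                    $(1+x)^{-\beta-1}$


The symbol $f(x) \sim g(x)$ means that f(x)/g(x) approaches a positive limit as
$x \to -1+0$.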
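We also note the identity

(2.4) $Y = (\vartheta - c)y = (1-x)^{-\alpha}(1+x)^{-\beta}\,u^{-1}\,t'(x), \qquad t(x) = H(x)(y'u - yu'), \qquad H(x) = (1-x)^{\alpha+1}(1+x)^{\beta+1}.$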

Now let y have N sign changes in $-1 < x < +1$, N > 0; then N abscissae
$a_\nu$ exist, $+1 = a_0 > a_1 > \cdots > a_N > a_{N+1} = -1$, such that y is alternately less
than or equal to 0 and greater than or equal to 0 in the intervals $a_{\nu+1}, a_\nu$
without being identically zero in these intervals. We may assume that in an
arbitrary small left-hand neighborhood of $a_\nu$ there are abscissae for which
$y \ne 0$, $1 \le \nu \le N$. (By this condition the $a_\nu$ are uniquely determined.) Obviously
$y(a_\nu) = 0$, $1 \le \nu \le N$. Then by Rolle's theorem we conclude the existence
of at least $N-1$ zeros for $u^2 (y/u)' = y'u - yu'$, hence also for t(x), between $a_N$
and $a_1$ separating the abscissae $a_\nu$; in addition $\lim t(x) = 0$ as $x \to 1-0$.

But t(x) must have also a zero in $-1 < x < a_N$. Assume the contrary, for
instance t(x) < 0 or (y/u)' < 0 in $-1 < x < a_N$. Then y/u is decreasing in this
interval and since $y(a_N) = 0$ we must have y > 0 in $-1 < x < a_N$ and y > hu in
$-1 < x \le a_N - \epsilon$ [$0 < \epsilon < a_N + 1$, $h = h(\epsilon) > 0$].

In case $\beta \ge 0$ we conclude that $y \to +\infty$ as $x \to -1+0$ [see table (2.3)]
which is a contradiction.

In case $-1 < \beta < 0$ we obtain y > h' (h' > 0) for $-1 < x \le a_N - \epsilon$. But in this
case as x>—1+0 so that 


— (1 + yu’ — (1 + 


hence t(x) > 0 when x is sufficiently near $-1$. This is again a contradiction.

Recapitulating, we have found certain zeros $\beta_0, \beta_1, \ldots, \beta_N$ of t(x) satisfying
the inequalities $+1 = \beta_0 > \beta_1 > \cdots > \beta_{N-1} > \beta_N > -1$ and $a_{\nu+1} < \beta_\nu < a_\nu$,
$1 \le \nu \le N$. Repeated application of Rolle's theorem furnishes at least N sign
changes of Y. Note that t(x) cannot be identically 0 in $\beta_{\nu+1}, \beta_\nu$ since this
would imply y/u = const., hence y = 0 on account of $y(a_{\nu+1}) = 0$. But $y \ne 0$ at
suitable points to the left from $a_{\nu+1}$.

The remaining case c = 0 can easily be settled. The identity (2.4) holds
then with u(x) = 1, that is, t(x) = H(x)y'. In this case t(x) has at least $N-1$
zeros in the interior of $-1$, $+1$ and in addition the zeros $x = \pm 1$.


Qet, 


(2.5) 




3. Proof of Theorem B'. Let us start with certain preliminary remarks on
Jacobi polynomials $P_n^{(\alpha,\beta)}(x)$. For arbitrary real values of $\alpha$ and $\beta$ we use the
definition [see Szegő, loc. cit.(4) p. 61, (4.21.2)]


=1; 


(3.1) 


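Then $y = P_n^{(\alpha,\beta)}(x)$ satisfies the differential equation $(\vartheta + n(n + \alpha + \beta + 1))y = 0$
[Szegő, loc. cit. p. 59, (4.2.1)]. Furthermore, except for an additive constant
[loc. cit. p. 62, (4.21.7)]

(3.2) $\int P_n^{(\alpha,\beta)}(x)\,dx = 2(n + \alpha + \beta)^{-1} P_{n+1}^{(\alpha-1,\beta-1)}(x).$

We also note Rodrigues' formula [loc. cit. p. 66, (4.3.1)]

(3.3) $(1-x)^{\alpha}(1+x)^{\beta} P_n^{(\alpha,\beta)}(x) = (-1)^n (2^n n!)^{-1} (d/dx)^n \{(1-x)^{n+\alpha}(1+x)^{n+\beta}\}.$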


From (3.1) we see that $P_n^{(\alpha,\beta)}(x)$, $n \ge 1$, is of the precise degree n provided
$\alpha + \beta \ne -2, -3, \ldots$; in case $\alpha + \beta = -l-1$, l a positive integer, it is still
of the precise degree n provided n > l.
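
The differential equation quoted above from Szegő's (4.2.1) can be verified exactly for small n. The sketch below is an illustration under stated assumptions, not the paper's method: it assumes $\alpha > -1$ so that no denominator vanishes, it uses the standard terminating hypergeometric representation of the Jacobi polynomial in the variable $t = (1-x)/2$ (the same variable as in (2.1)), and the function names are hypothetical. In that variable the equation becomes $t(1-t)y'' + [(\alpha+1) - (\alpha+\beta+2)t]y' + n(n+\alpha+\beta+1)y = 0$.

```python
from fractions import Fraction


def jacobi_coeffs_in_t(n, alpha, beta):
    """Coefficients c_0..c_n of the Jacobi polynomial in powers of t = (1-x)/2.

    Uses P_n = binom(n+alpha, n) * F(-n, n+alpha+beta+1; alpha+1; t);
    alpha > -1 is assumed so that alpha + 1 + k never vanishes.
    """
    a, b, c = -n, n + alpha + beta + 1, alpha + 1
    lead = Fraction(1)
    for j in range(1, n + 1):
        lead *= Fraction(alpha + j) / j          # binom(n + alpha, n)
    coeffs, term = [], Fraction(1)
    for k in range(n + 1):
        coeffs.append(lead * term)
        # ratio of consecutive hypergeometric terms
        term *= Fraction((a + k) * (b + k)) / ((c + k) * (k + 1))
    return coeffs


def ode_residual(coeffs, n, alpha, beta):
    """Coefficients of t(1-t)y'' + [(alpha+1) - (alpha+beta+2)t]y' + n(n+alpha+beta+1)y."""
    lam = n * (n + alpha + beta + 1)
    res = [Fraction(0)] * (len(coeffs) + 1)
    for k, ck in enumerate(coeffs):
        if k >= 2:
            res[k - 1] += k * (k - 1) * ck       # t * y''
            res[k] -= k * (k - 1) * ck           # -t^2 * y''
        if k >= 1:
            res[k - 1] += (alpha + 1) * k * ck   # (alpha+1) * y'
            res[k] -= (alpha + beta + 2) * k * ck
        res[k] += lam * ck
    return res
```

For $\alpha = \beta = 0$ and n = 2 the coefficients are `[1, -6, 6]`, that is, $1 - 6t + 6t^2$, which is the Legendre polynomial $(3x^2 - 1)/2$ written in t; the residual is identically zero.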

In case $\alpha > -1$, $\beta > -1$ we conclude from (3.3) in the familiar manner the
orthogonality relation

(3.4) $\int_{-1}^{+1} (1-x)^{\alpha}(1+x)^{\beta} P_n^{(\alpha,\beta)}(x)\, q(x)\,dx = 0$

where q(x) is an arbitrary polynomial of degree $n-1$. Now let $\alpha$ and $\beta$ be
arbitrary real and let m be the smallest non-negative integer such that
$\alpha + m > -1$, $\beta + m > -1$. Taking $n \ge 2m+1$ and $q(x) = (1-x^2)^m r(x)$ where r(x)
is an arbitrary polynomial of degree $n-2m-1$ we find that for this particular
type of polynomials q(x) the orthogonality relation (3.4) still holds.

Under the same condition we have [loc. cit. p. 62, (4.21.6), p. 67, (4.3.3) ] 


+1 


(3.5) m + a+ 1)0(n + 6 + 1) 


= (-1)"2 
+ a + B + 2) 




After these preliminaries we proceed to the proof of Theorem B'. First
let us exclude the case $\alpha + \beta = -l-1$, l positive integer. We expand $f^{(m)}(x)$ in
a series of Jacobi polynomials $P_n^{(\alpha+m,\beta+m)}(x)$;


(3.6) f= 


Term-by-term integration and use of (3.2) furnishes 


(3.7) f=) = (2) + (nt atB+ 2m)(n+a+ 2m —1) 


(a,B) 


(2) 


where $\phi(x)$ is a polynomial of degree $m-1$ [for m = 0 we have $\phi(x) = 0$]. Since
in this case $P_n^{(\alpha,\beta)}(x)$ is of the precise degree n we can write


(3.8) Ke) = 
Obviously 


n=O 


Now let k belong to a certain infinite sequence such that the corresponding
functions $(\vartheta - c)^k f(x)$ have a fixed number, M say, sign changes; $M \le N$(8). We
denote the abscissae at which these sign changes take place by $x_1, x_2, \ldots, x_M$;
$x_\nu = x_\nu(k)$. Then if $\delta = +1$ or $-1$ is properly chosen,


(3.10) $\delta \int_{-1}^{+1} (1-x)^{\alpha+m}(1+x)^{\beta+m} \{(\vartheta - c)^k f(x)\}(x - x_1)(x - x_2)\cdots(x - x_M)(1+x)^p\,dx > 0.$


Here p is an arbitrary non-negative integer and $\delta$ does not depend on p.
Substituting for $(\vartheta - c)^k f(x)$ its expansion (3.9) the arising integrals will all
vanish provided $n > 2m + M + p$. However for $n = n' = 2m + M + p$ we obtain


+ 8(— 1)*[c + + a+ 6+ 1)]* 
+1 
f "a+ x 


-1 


B+m _ (a,8) M+p 


Py (x)x dx, 


and the last integral is different from 0 because of (3.5). Hence if ¢,-+0 we 
find for k-« 


(8) From here on we use the argument of the paper cited in footnote 6.




which is impossible provided

$$|c + n'(n' + \alpha + \beta + 1)| > \max_{\nu \le n'-1} |c + \nu(\nu + \alpha + \beta + 1)|.$$

This is the case if $n' \ge n_0 = n_0(\alpha, \beta, c)$.

The previous argument furnishes ¢,=0 for n22m+M, n2m, which is 
equivalent to the assertion of Theorem B’. 

In case $\alpha + \beta = -l-1$, l positive integer, this proof needs a slight modification.
We integrate then only the terms $n \ge m+1$ in (3.6) and conclude (3.7)
with the modification that the summation is now extended over the range
$n \ge m+1$ and $\phi(x)$ is a polynomial of degree 2m. [The expression in the braces
of (3.7) is then positive since $2m + \alpha + \beta + 2 > 0$.] As a further addition to the
previous argument we have to show that


$$(\vartheta - c)^k \phi(x) = O(1)\,|c + n'(n' + \alpha + \beta + 1)|^k, \qquad k \to \infty,$$

uniformly for $-1 \le x \le +1$ provided n' is sufficiently large, $n' \ge n_1 = n_1(\alpha, \beta, c)$.
But $(\vartheta - c)^k \phi(x)$ is a polynomial of degree 2m and the last assertion follows
if we can show that the coefficients of this polynomial have moduli at most
$R S^k$; here R > 0 depends on f(x), $\alpha$, $\beta$, c and S > 0 depends only on $\alpha$, $\beta$, c.
Now

$$(\vartheta - c)x^h = h(h-1)(1 - x^2)x^{h-2} + h[\beta - \alpha - (\alpha + \beta + 2)x]x^{h-1} - c x^h;$$


hence with arbitrary constants A, 


(3.11) 


: 2m 2m 
h=0 h=0 


where

(3.13) $S = 2 \cdot 2m(2m-1) + 2m|\beta - \alpha| + 2m|\alpha + \beta + 2| + |c|.$

This furnishes the statement by taking for R the maximum modulus of the
coefficients of $\phi(x)$ and choosing S according to (3.13).

Theorems B and B' remain of course true if the condition regarding
$(\vartheta - c)^k f(x)$ is satisfied only for an infinite number of values of k.


STANFORD UNIVERSITY, 
STANFORD UNIVERSITY, CALIF. 




ON STRUCTURES OF INFINITE MODULES 


BY 
R. E. JOHNSON 


Much of the literature on the structures of modules applies to those 
modules which possess a finite basis. The present paper is the development 
of a structure theory for particular infinite modules with countable bases. 
Generality of results is not as much the aim of the paper as is the application 
to problems concerning infinite matrices. 

For a commutative field P, $\Sigma$ is assumed to be a universal P-module
which has a countable P-basis. A principal ideal ring Q which contains P is
considered as an operator domain of $\Sigma$. Then the main topic studied is under
what conditions submodules of $\Sigma$ have proper Q-bases.

In the first place, a complete characterization is given for the proper
Q-bases of any Q-submodule of $\Sigma$. This is represented as an infinite matrix,
and is called the characteristic matrix of the submodule.

The finite case is studied in the third section. The results obtained are 
comparable with those of Ingraham and Wolf [3](*) and Chevalley [1]. The 
principal theorem is that every Q-module which possesses a finite Q-basis 
has a proper Q-basis. 

The concepts of primitivity—defined somewhat as Chevalley defines it— 
and index play an important role in determining conditions for a Q-module 
to have a proper Q-basis. In order to find these conditions, the non-regular 
elements H of & are split from 2. The resulting Q-module Z/H is regular. 
Then necessary and sufficient conditions are found for both H and Z/H to 
have a proper Q-basis. 

If the operator domain of © be considered as Q/(m), m not a unit of Q, 
then in the fifth section it is seen that © possesses a proper Q/(m)-basis. 

As an application of these results, Z is taken to be the set of all vectors 
over P of order type w which are finitely nonzero. The total operator domain 
of & is a certain ring of infinite matrices, Jt,. Then any element A of M,, can 
be transformed into a direct sum of finite matrices only if & has a proper 
P[A ]-basis. 

The algebraic theory assumed herein can be found in almost any book 
on modern algebra—specific attention is called to MacDuffee [4] and 
Zassenhaus [5]. 

1. Introduction. Let P denote a commutative field, and Q a principal ideal 


Presented to the Society, April 18, 1942; received by the editors April 28, 1942, and, in 
revised form, October 13, 1942. 
(*) The numbers in brackets refer to the bibliography at the end of the paper. 




ring which contains P(2). The universal P-module (linear set over P) of all
modules discussed below will be labelled by $\Sigma$. It is assumed to have Q as an
operator domain. While $\Sigma$ will in general have an infinite number of elements,
yet only finite sums are ever considered. Small Greek letters will always denote
elements of $\Sigma$, capital Greek letters will stand for subsets of $\Sigma$, and
small Latin letters will be used for elements of Q.

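For subsets $\Sigma_1, \Sigma_2, \cdots$ of $\Sigma$, $\Sigma_1 \vee \Sigma_2 \vee \cdots$
is used to denote the least P-module in $\Sigma$ which contains all $\Sigma_i$, while
$\Sigma_1 \wedge \Sigma_2 \wedge \cdots$ denotes the set-theoretic intersection of all
$\Sigma_i$. If the $\Sigma_i$, $i = 1, 2, \cdots$, are P-modules, and
$\Sigma' = \Sigma_1 \vee \Sigma_2 \vee \cdots$, then $\Sigma'$ is the supplementary sum
of the $\Sigma_i$, written

\[ \Sigma' = \Sigma_1 + \Sigma_2 + \cdots, \]

in case the representation of every element of $\Sigma'$ by sums of elements in the
$\Sigma_i$ is unique. This is equivalent to the condition

\[ \Sigma_i \wedge (\Sigma_1 \vee \Sigma_2 \vee \cdots \vee \Sigma_{i-1}) = 0, \qquad i = 2, 3, \cdots. \]

For $\Sigma_2 \subset \Sigma_1$, $\Sigma_1 - \Sigma_2$ is the set of all elements of
$\Sigma_1$ not in $\Sigma_2$.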
The universal P-module $\Sigma$ is said to have the P-basis $(\xi_1, \xi_2, \cdots)$,
finite or infinite, if

\[ \Sigma = P[\xi_1, \xi_2, \cdots], \]

this last being the set of all finite combinations $\sum a_i\xi_i$, $a_i \in P$. If the
set $(\xi_1, \xi_2, \cdots)$ is P-linearly independent (that is, for every finite sum
$\sum a_i\xi_i = 0$, all $a_i = 0$) this basis is called regular. In this case one has

\[ \Sigma = P\xi_1 + P\xi_2 + \cdots. \]
The following axiom is assumed throughout the paper. 


FUNDAMENTAL AXIOM. The universal module $\Sigma$ has a countable P-basis.
A consequence of this axiom (see Ingraham [2]) is
THEOREM 1.1. Every P-submodule $\Sigma_1$ of $\Sigma$ has a proper P-basis.


The set $(\xi_1, \xi_2, \cdots)$ is a Q-basis for $\Sigma_1$ in case
$\Sigma_1 = Q[\xi_1, \xi_2, \cdots]$. The set $\Sigma_1$ is a Q-module in case
$a\xi \in \Sigma_1$ for every $a \in Q$ and every $\xi \in \Sigma_1$.

DEFINITION 1.2. A set of elements $(\xi_1, \xi_2, \cdots)$ is a proper Q-basis of
$\Sigma_1$, and is thus Q-linearly independent, if and only if
(1) $\xi_i \ne 0$, $i = 1, 2, \cdots$,
and
(2) $\Sigma_1 = Q\xi_1 + Q\xi_2 + \cdots$.


(2) As a particular instance of these concepts, we can consider P as the rational field and Q
the ring P[x] of all polynomials in the indeterminate x with coefficients in P. The first part of
§6 might well be read first to give one a concrete example of the sets Q and $\Sigma$.




An element a of Q is an annihilator of $\xi$ if $a\xi = 0$. If $a_1$ and $a_2$ are two
annihilators of $\xi$, then $b_1a_1 + b_2a_2$ is an annihilator of $\xi$ for any two
elements $b_1$ and $b_2$ of Q. Thus the set of all annihilators of $\xi$ forms an ideal.
Since Q is by assumption a principal ideal ring, this ideal is principal of the form (h).
As h divides all annihilators of $\xi$, it will be called a minimum annihilator of $\xi$.
It is unique up to a unit factor.


DEFINITION 1.3. If $a\xi = 0$ implies $a = 0$, then $\xi$ is called regular. If every
nonzero element of a Q-module $\Sigma_1$ is regular, then $\Sigma_1$ is called regular.


If $\xi$ and $\eta$ are two non-regular elements of $\Sigma$ with minimum annihilators
$a_1$ and $a_2$, respectively, then for any elements $b_1$ and $b_2$ of Q,
$b_1\xi + b_2\eta$ is annihilated by $a_1a_2$. Thus the set of all non-regular elements
of $\Sigma$ forms a Q-module which we shall label H. Suppose $(\eta_1, \eta_2, \cdots)$
is a P-basis for H, with $h_i$ the minimum annihilator of $\eta_i$. Then we have defined
a set $(p_1, p_2, \cdots)$ of primes of Q, which are all the distinct prime factors of
the $h_i$, $i = 1, 2, \cdots$ (p and q are not distinct if p = qc, c a unit). Now any
element $\eta$ of H is of the form $\eta = \sum_{i=1}^{n} a_i\eta_i$, $a_i \in P$, so
$\eta$ is annihilated by some product of the primes $p_i$. For any element p of Q, let
$H_p$ be the set of all elements of H annihilated by some power of p. Then $H_p$ is a
Q-module.


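THEOREM 1.4. $H = H_{p_1} + H_{p_2} + \cdots$.


To prove this, let $\eta$ be any element of H, and let $\prod_{i=1}^{n} p_i^{t_i}$ be
its minimum annihilator. Then there exist elements $s_j$ such that

\[ \sum_{j=1}^{n} s_j \prod_{i \ne j} p_i^{t_i} = 1. \]

Thus $\eta = \eta_1 + \eta_2 + \cdots + \eta_n$, where

\[ \eta_j = s_j \prod_{i \ne j} p_i^{t_i}\,\eta, \qquad j = 1, 2, \cdots, n, \]

and each $\eta_j$ is annihilated by $p_j^{t_j}$, so that $\eta_j \in H_{p_j}$.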

As an immediate consequence of this theorem, we have that for any
Q-module $H_1 \subset H$,

\[ H_1 = (H_1 \wedge H_{p_1}) + (H_1 \wedge H_{p_2}) + \cdots. \]

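Any P-module $\Sigma_1$ is fundamentally an abelian group, so that the quotient
group $\Sigma_1/\Sigma_2$ is well defined for any P-submodule $\Sigma_2$ of $\Sigma_1$,
and is itself a P-module. Likewise, if $\Sigma_1$ and $\Sigma_2$ are Q-modules with
$\Sigma_2 \subset \Sigma_1$, then $\Sigma_1/\Sigma_2$ is a Q-module.

If $H_1$ is the set of all non-regular elements of the Q-module $\Sigma_1$, then
$\Sigma_1/H_1$ has elements of the form $(\xi + H_1)$ for $\xi \in \Sigma_1$ (this should
not be confused with the supplementary sum; it means the set of all elements of the form
$\xi + \eta$, $\eta \in H_1$). Suppose $a(\xi + H_1) = (H_1)$ for some element $a \in Q$,
with $\xi$ not in $H_1$. Then $a\xi \in H_1$, so $ba\xi = 0$ for some nonzero $b \in Q$;
as $\xi$ is regular, $ba = 0$, so that $a = 0$. Thus $\Sigma_1/H_1$ is a regular Q-module.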




2. Invariants of proper Q-bases. Let $\Sigma_1$, a Q-module, have a proper Q-basis
$(\eta_1, \eta_2, \cdots, \xi_1, \xi_2, \cdots)$, the $\xi_i$ being regular elements, and
$\eta_i$ having minimum annihilator $q_i$, $i = 1, 2, \cdots$. If
$H_1 = Q[\eta_1, \eta_2, \cdots]$, $\Omega_1 = Q[\xi_1, \xi_2, \cdots]$,
then $H_1 \subset H$.


DEFINITION 2.1. The infinite matrix $(n_{rs};\ r = 0, 1, 2, \cdots;\ s = 1, 2, \cdots)$,
in which $n_{rs}$ is the number of the $q_i$, $i = 1, 2, \cdots$, divisible by $p_s^{r}$
but not by $p_s^{r+1}$ for $r, s = 1, 2, \cdots$, and $n_{0s}$ is the cardinal number of
the set $(\xi_1, \xi_2, \cdots)$, $s = 1, 2, \cdots$, is called the characteristic
matrix(*) of the proper Q-basis $(\eta_1, \eta_2, \cdots, \xi_1, \xi_2, \cdots)$. The
elements of this matrix are integers or $\aleph_0$.
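The counting in Definition 2.1 can be sketched in a few lines. This is only an illustration, not part of the original paper: each minimum annihilator is assumed to be given already factored, as a dictionary from a hypothetical prime label (such as "p1") to its exponent, and only the rows r >= 1 of the matrix are computed.

```python
from collections import Counter


def characteristic_counts(annihilators):
    """Return {(r, prime): n} where n is the number of annihilators
    divisible by prime**r but not by prime**(r + 1), for r >= 1.

    Each annihilator is a dict {prime_label: exponent}; this encoding is
    an assumption of the sketch, and unit factors are ignored.
    """
    counts = Counter()
    for q in annihilators:
        for prime, exponent in q.items():
            if exponent >= 1:
                counts[(exponent, prime)] += 1
    return dict(counts)


# Three non-regular basis elements with minimum annihilators
# p1^2 * p2, p1^2 and p2^3 (hypothetical data).
example = [{"p1": 2, "p2": 1}, {"p1": 2}, {"p2": 3}]
table = characteristic_counts(example)
```

Entries absent from the returned dictionary correspond to zeros of the characteristic matrix; the row r = 0, which records the number of regular basis elements, is not modelled here.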


THEOREM 2.2. The characteristic matrix is an invariant of the class of all
proper Q-bases of $\Sigma_1$.

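To prove this, let $(\eta_{j1}, \eta_{j2}, \cdots, \xi_{j1}, \xi_{j2}, \cdots)$,
$j = 1, 2$, be two proper Q-bases of $\Sigma_1$, with the $\xi_{ji}$ regular and
$\eta_{ji}$ annihilated minimally by $q_{ji}$, $i = 1, 2, \cdots$. The number of elements
of the set $(q_{j1}, q_{j2}, \cdots)$ divisible by $p_s^{r}$ but not by $p_s^{r+1}$ is
denoted by $n_{jrs}$, $r, s = 1, 2, \cdots$. The cardinal number of the set
$(\xi_{j1}, \xi_{j2}, \cdots)$ is denoted by $n_{j0s}$, so that
$n_{j01} = n_{j02} = \cdots$. We shall first show that $n_{111} = n_{211}$.

Select $r_{ji}$ from Q so that $q_{ji} = p_1^{t_{ji}} r_{ji}$ with the greatest common
divisor of $p_1$ and $r_{ji}$, denoted $(p_1, r_{ji})$, equal to 1. The nonzero elements
of the set $(r_{j1}\eta_{j1}, r_{j2}\eta_{j2}, \cdots)$ are Q-linearly independent: let us
relabel these nonzero elements $(\alpha_{j1}, \alpha_{j2}, \cdots)$. Then

\[ (1) \qquad Q\alpha_{11} + Q\alpha_{12} + \cdots = Q\alpha_{21} + Q\alpha_{22} + \cdots, \]

this being the set $H_{p_1}$ of all elements of $H_1$ annihilated by some power of $p_1$.
If $n_{111} = n_{211} = \aleph_0$, the first step of the proof is concluded. Thus assume
$n_{111} = n < \aleph_0$. Let us separate the $\alpha_{ji}$ into the sets
$(\beta_{j1}, \beta_{j2}, \cdots)$, $(\gamma_{j1}, \gamma_{j2}, \cdots)$, with the first
set being all the $\alpha_{ji}$ annihilated by $p_1$, and the second set the remainder of
the $\alpha_{ji}$. It is observed that no member of the set
$(\beta_{11}, \beta_{12}, \cdots, \beta_{1n})$ could be in the module
$Q[\gamma_{21}, \gamma_{22}, \cdots]$. For if

\[ \beta_{11} = \sum_{i=1}^{t} c_i\gamma_{2i}, \]

then each $c_i$ must be divisible by $p_1$, as $p_1$ annihilates $\beta_{11}$ but does not
annihilate any $\gamma_{2i}$. Thus $\beta_{11} = p_1\gamma$. However,
$\gamma \in Q[\beta_{11}, \beta_{12}, \cdots, \gamma_{11}, \gamma_{12}, \cdots]$, say

\[ \gamma = \sum_i d_i\beta_{1i} + \sum_i e_i\gamma_{1i}, \]

so that $\beta_{11} = \sum_i p_1 e_i\gamma_{1i}$, which is impossible. From (1), we have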


(*) This bears no relationship with the ordinary concept of characteristic matrix. The name 
was chosen because of the connection between this matrix and the characteristic divisors of 


certain infinite matrices, 


r 


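\[ \beta_{1i} = \sum_{j=1}^{m_i} b_{1ij}\beta_{2j} + p_1\delta_{1i}, \qquad \delta_{1i} \in Q[\gamma_{21}, \gamma_{22}, \cdots], \]
\[ \beta_{2i} = \sum_{j=1}^{n} b_{2ij}\beta_{1j} + p_1\delta_{2i}, \qquad \delta_{2i} \in Q[\gamma_{11}, \gamma_{12}, \cdots]. \]

A substitution yields, if we let m be the maximum $m_i$, and $b_{1ij} = 0$,
$m_i < j \le m$,

\[ (2) \qquad \beta_{1i} = \sum_{j=1}^{m}\sum_{k=1}^{n} b_{1ij}b_{2jk}\beta_{1k} + p_1\delta_i, \qquad
   \beta_{2i} = \sum_{j=1}^{n}\sum_{k=1}^{m} b_{2ij}b_{1jk}\beta_{2k} + p_1\epsilon_i. \]

From the Q-linear independence of the sets
$(\beta_{j1}, \beta_{j2}, \cdots, \gamma_{j1}, \gamma_{j2}, \cdots)$ one must have
$p_1\delta_i = p_1\epsilon_i = 0$. If the matrices $B_1$ and $B_2$ are defined as

\[ B_1 = (b_{1rs};\ r = 1, \cdots, n;\ s = 1, \cdots, m), \qquad
   B_2 = (b_{2rs};\ r = 1, \cdots, m;\ s = 1, \cdots, n), \]

then from (2) one can conclude (using $I_k$ as the unit matrix of order $k^2$)

\[ B_1B_2 \equiv I_n, \qquad B_2B_1 \equiv I_m \mod p_1. \]

However $B_1$ and $B_2$ have elements in the field $Q/(p_1)$, so that m and n must
be equal. Also, from (2), for $t > n$,

\[ \beta_{2t} = \sum_{j=1}^{n}\sum_{k=1}^{n} b_{2tj}b_{1jk}\beta_{2k}, \]

which is impossible, as the set $(\beta_{21}, \beta_{22}, \cdots)$ is Q-linearly
independent. The conclusion is that there are n elements in the set
$(\beta_{21}, \beta_{22}, \cdots)$, so that $n_{111} = n_{211}$.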

In order to show that $n_{1r1} = n_{2r1}$, consider the set $p_1^{r-1}H_{p_1}$, which has
the two proper (if we exclude the zero elements) Q-bases
$(p_1^{r-1}\alpha_{j1}, p_1^{r-1}\alpha_{j2}, \cdots)$, $j = 1, 2$. From the paragraph
above, the number of nonzero elements annihilated by $p_1$ in each basis is the same. This
number must be the number of $q_{ji}$ divisible exactly by $p_1^{r}$, so that
$n_{1r1} = n_{2r1}$.

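We need only to show that $n_{101} = n_{201}$ to complete the proof. Let

\[ \xi_{1j} = \sum_k c_{1jk}\eta_{2k} + \sum_{k=1}^{m_j} d_{1jk}\xi_{2k}, \qquad j = 1, 2, \cdots, n_{101}, \]
\[ \xi_{2j} = \sum_k c_{2jk}\eta_{1k} + \sum_k d_{2jk}\xi_{1k}, \qquad j = 1, 2, \cdots, n_{201}. \]

Case 1. Let $n_{101} = \aleph_0$, $n_{201}$ finite: if m be the maximum $m_j$,
$j = 1, 2, \cdots, n_{201}$, then
$\Sigma_1 \subset Q[\eta_{11}, \eta_{12}, \cdots, \xi_{11}, \xi_{12}, \cdots, \xi_{1m}]$,
which is impossible.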




Case 2. Let $n_{101}$, $n_{201}$ both be finite: then there exists an element $c \in Q$
such that


n2o1 


m101 


If $D_k = (d_{krs})$, $k = 1, 2$, then by the method above

\[ cD_1D_2 = cD_2D_1 = cI. \]

As Q can always be imbedded in a field, we can consider $D_1$ and $D_2$ as having
elements in a field, so that $n_{101} = n_{201}$. Thus the characteristic matrices of the
two bases must be equal.

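Consider again the proper Q-basis for $\Sigma_1$ given at the beginning of this
section. If $q_i = c\,p_1^{t_1} \cdots p_n^{t_n}$, c a unit and all $t_j \ge 1$, then let

\[ \eta_{i1} = p_2^{t_2} \cdots p_n^{t_n}\eta_i, \quad
   \eta_{i2} = p_1^{t_1}p_3^{t_3} \cdots p_n^{t_n}\eta_i, \quad \cdots, \quad
   \eta_{in} = p_1^{t_1} \cdots p_{n-1}^{t_{n-1}}\eta_i. \]

We see that $Q\eta_i = Q\eta_{i1} + Q\eta_{i2} + \cdots + Q\eta_{in}$, where each
$\eta_{ij}$ is minimally annihilated by a power of a prime. Now this can be done for every
$\eta_i$, so it is apparent that $\Sigma_1$ has a proper Q-basis
$(\beta_1, \beta_2, \cdots, \xi_1, \xi_2, \cdots)$, with the $\xi_i$ regular and the
$\beta_i$ possessing powers of primes as minimum annihilators.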

With the use of this basis, the completeness of the invariant characteristic
matrix $(n_{rs})$ will be shown by the following


THEOREM 2.3. For any set $(r_1, r_2, \cdots)$ of elements of Q such that the number
of these elements divisible by $p_s^{r}$ but not by $p_s^{r+1}$ is $n_{rs}$, there exists
a proper Q-basis $(\alpha_1, \alpha_2, \cdots, \xi_1, \xi_2, \cdots)$ of $\Sigma_1$, with
$r_i$ the minimum annihilator of $\alpha_i$, and $n_{0s}$ the number of elements in the
set $(\xi_1, \xi_2, \cdots)$.


If $r_1 = c\,p_1^{t_1} \cdots p_k^{t_k}$, all $t_i > 0$ and c a unit, then let $j_1$ be
the minimum integer such that $\beta_{j_1}$ has minimum annihilator $p_1^{t_1}$, and, in
general, let $j_i$ be the minimum integer such that $\beta_{j_i}$ has minimum annihilator
$p_i^{t_i}$. Now define $\alpha_1 = \beta_{j_1} + \beta_{j_2} + \cdots + \beta_{j_k}$, so
that $\alpha_1$ has minimum annihilator $r_1$. Discard
$\beta_{j_1}, \beta_{j_2}, \cdots, \beta_{j_k}$ from the set $(\beta_1, \beta_2, \cdots)$,
and with the remaining set carry through a similar process for $r_2$, obtaining
$\alpha_2$. From Theorem 2.2, there will be precisely enough elements in the set
$(\beta_1, \beta_2, \cdots)$ to carry this process to completion, using all the $r_i$.
We can use the same regular elements in the new basis as in the old one. This completes
the proof.


COROLLARY 2.4. If all the elements of $H_1$ are annihilated by some power of
the prime p, and $H_1$ has a proper Q-basis, then the number of elements in any
proper Q-basis of $H_1$ is an invariant of $H_1$.




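If a proper Q-basis has an infinite number of elements, obviously any
Q-basis has an infinite number of elements. Suppose $(\eta_1, \cdots, \eta_n)$ is a proper
Q-basis of $H_1$, and $(\alpha_1, \cdots, \alpha_m)$ any Q-basis. Then

\[ \eta_i = \sum_{j=1}^{m} a_{ij}\alpha_j, \qquad \alpha_j = \sum_{k=1}^{n} b_{jk}\eta_k, \]

so that

\[ \sum_{j=1}^{m} a_{ij}b_{jk} \equiv 1 \mod p \ \text{ if } i = k, \qquad
   \sum_{j=1}^{m} a_{ij}b_{jk} \equiv 0 \mod p \ \text{ otherwise}, \]

which is possible only if $m \ge n$. This leads to


THEOREM 2.5. If $H_1$ has a proper Q-basis, then the number of elements in
any Q-basis cannot be less than the number of elements in a proper Q-basis.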


3. The finite case. Before the general case is studied, it is necessary to
consider those submodules of $\Sigma$ which have finite Q-bases.


LEMMA 3.1. If $\Sigma_1$ has a finite Q-basis, and $\Sigma_2 \subset \Sigma_1$,
$\Sigma_2$ a Q-module, then $\Sigma_2$ has a finite Q-basis. The Q-basis for $\Sigma_2$
can be chosen so as not to have more elements than the given Q-basis for $\Sigma_1$.


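To prove this, let $(\xi_1, \cdots, \xi_n)$ be a Q-basis of $\Sigma_1$. For any
$m \le n$, there exists a maximum ideal $\mathfrak{a}_m \subset Q$ such that

\[ \mathfrak{a}_m\xi_m \equiv 0 \mod \Sigma_2 \vee Q[\xi_1, \cdots, \xi_{m-1}]. \]

As Q is a principal ideal ring, $\mathfrak{a}_m = (s_m)$. Select $\eta_m \in \Sigma_2$ so
that

\[ \eta_m \equiv s_m\xi_m \mod Q[\xi_1, \cdots, \xi_{m-1}], \qquad \eta_m = 0 \ \text{ if } s_m = 0. \]

Suppose $\eta \in \Sigma_2$, so that $\eta = \sum_{i=1}^{n} a_i\xi_i$ with $a_i \in Q$.
Then

\[ \eta \equiv a_n\xi_n \mod Q[\xi_1, \cdots, \xi_{n-1}], \]

which implies $a_n = b_ns_n$ for some $b_n \in Q$, and

\[ \eta \equiv b_n\eta_n \mod Q[\xi_1, \cdots, \xi_{n-1}]. \]

Thus $\eta - b_n\eta_n \in \Sigma_2 \wedge Q[\xi_1, \cdots, \xi_{n-1}]$, and by induction
$\eta \in Q[\eta_1, \cdots, \eta_n]$. This shows that $(\eta_1, \eta_2, \cdots, \eta_n)$
is a Q-basis for $\Sigma_2$, and establishes the lemma.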

If $(\xi_1, \xi_2, \cdots, \xi_n)$ is a P-basis for the Q-module $\Sigma_1$, then for any
element $\xi \in \Sigma_1$ and any element $b \in Q$, not a unit, the set
$(\xi, b\xi, \cdots, b^{n}\xi)$ must be P-linearly dependent. Thus
$\sum_{i=0}^{n} a_ib^{i}\xi = 0$ for some elements $a_i \in P$, not all zero; we have


LEMMA 3.2. If $\Sigma_1$ is a Q-module with a finite P-basis, then $\Sigma_1 \subset H$.


LEMMA 3.3. If $\Sigma_1$ is a Q-module with a finite Q-basis, and $\Sigma_1 \subset H$,
then $\Sigma_1$ has a proper Q-basis.


so that 



To prove this, let $\Sigma_1 = Q[\xi_1, \xi_2, \cdots, \xi_n]$ with $\xi_i$ annihilated
minimally by $r_i$, so that $\prod_{i=1}^{n} r_i$ annihilates $\Sigma_1$. Let
$(p_1, \cdots, p_m)$ be all the distinct prime factors of the $r_i$,
$i = 1, 2, \cdots, n$. If $H_i$ denotes the set of all elements of $\Sigma_1$
annihilated by some power of $p_i$, then from Theorem 1.4,

\[ \Sigma_1 = H_1 + H_2 + \cdots + H_m. \]

We will prove that each $H_i$ has a proper Q-basis, which will imply a proper
Q-basis for $\Sigma_1$.

Denote by & the minimum integer such that pH, =0. Then there exists 
an element 7;€H; for which p? is the minimum annihilator. Recursively, if 
t, is the minimum integer such that »#H,=0 mod Q[m, i m-1], 
then there exists an element 7,€H; which has p* as minimum annihilator 
mod Q[m, Ma, It follows that --- 2h. 

Assume that the set m2, , is Q-linearly independent, and that 
From above, we must have 


= ani. 


If this equation be multiplied by the (&1—&)th .power of 1, it is appar- 
ent that (p# divides ax1). Similarly,tit can be verified that p¥|a;, 
4=1,2,--+,k—1, so that a;=b,p*. Then if 


ik = 1% — 


the set (m, 72, °° *, Hx) is Q-linearly independent. If this process were 
not finite, we would have a submodule of &, containing an infinite proper 
Q-basis. This is not possible in view of Theorem 2.5 and Lemma 3.1. 


THEOREM 3.4. If $\Sigma_1$ is a Q-module with a finite Q-basis, then $\Sigma_1$ has a
proper Q-basis.


This follows from Lemma 3.3 if all the elements of $\Sigma_1$ are non-regular.
Thus let $H_1 = \Sigma_1 \wedge H$, $H_1 \ne \Sigma_1$, so that $\Sigma_1/H_1$ is regular.
As $\Sigma_1$ has a finite Q-basis, so must $\Sigma_1/H_1$: denote this basis by
$(\xi_1, \xi_2, \cdots, \xi_n)$, all $\xi_i \ne 0$.

Assume every Q-module contained in 2,/H; which is generated by k 
or fewer elements has a proper (finite) Q-basis. Also assume that the set 
(£1, &, - + +, &) is Q-linearly independent, and that & 4; has a nonzero mini- 
mum annihilator ge: mod Q[&, &,---, &]. Then there exist elements 
gy such that 


k+1 
qits 0, (q1, q2, = 1. 


t=1 


In the case under consideration, it is well known (see MacDuffee [4, 




p. 227]) there exists a matrix C=(c,,) of (+1)? elements with 
j=i, 2, , k+1, such that |C| =1 and C has a unique inverse B=(6,,). 


Let 
k+1 


=D cin #=1,2,---,&k 
jul 


Remembering that we see that 


bums = = t= 1,2,---,k+1. 
Thus Q[au, a2, +++, a] ++, By assumption, this set has a 
proper Q-basis. An obvious induction leads to a proper Q-basis for 2:/H. 
If this basis is (8:+H1, Be 6: then 


Ei = Hi + + +--+ + 


and the theorem follows. 

4. The general case. We now turn to the consideration of any Q-submodule
of $\Sigma$, and develop conditions under which it possesses a proper
Q-basis.


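DEFINITION 4.1. For the Q-modules $\Sigma_1$, $\Sigma_2$ with
$\Sigma_2 \subset \Sigma_1$, $\Sigma_2$ is called primitive in $\Sigma_1$ if, for every
$\xi \in \Sigma_1$ and $p \in Q$ such that $p\xi \in \Sigma_2$, $p \ne 0$, there exists an
element $\xi_1 \in \Sigma_1$ for which $p\xi_1 = 0$ and $\xi - \xi_1 \in \Sigma_2$.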


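DEFINITION 4.2. If $\Sigma_1$ is a Q-module, then the index of an element
$\xi \in \Sigma_1$, written $i(\xi, \Sigma_1)$, is defined as follows:

(1) For $\xi \in \Sigma_1 \wedge H$, $\xi$ minimally annihilated by
$\prod_{j=1}^{k} p_j^{t_j}$, $p_j$ a prime of Q and $t_j \ge 1$ for $j = 1, 2, \cdots, k$,
$i(\xi, \Sigma_1)$ is the maximum integer s for which there exists an element
$\xi_1 \in \Sigma_1$ such that

\[ \xi = \prod_{j=1}^{k} p_j^{s_j}\,\xi_1, \qquad \sum_{j=1}^{k} s_j = s. \]

(2) For $\xi$ regular, $i(\xi, \Sigma_1)$ is the maximum integer s for which there exist
primes $q_1, q_2, \cdots, q_s$ in Q and $\xi_1 \in \Sigma_1$ such that

\[ \xi \equiv \prod_{j=1}^{s} q_j\,\xi_1 \mod \Sigma_1 \wedge H. \]

If in either case this maximum does not exist, $i(\xi, \Sigma_1) = \infty$;
$i(0, \Sigma_1) = 0$.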
If =, has non-regular component Hi, and Mi, then obviously i(n, 
=1i(, 21). Also, for Z2C Zi, Be, 22) Si(n, 21). 


THEOREM 4.3. Let be a Q-module, and ---, Hy; 
=2:/\H,,. Then for any so that 




i(n, Zi) = i(ns, 


jul 


To prove om, we first refer back to Theorem 1.4. If » has minimum 
annihilator [ [7.:/%, then there exist elements a;€Q such that 


(1) Il pi = 1, n3=a; IJ pi'n. 


If i(n, 21) =s < ©, then there exists an a€ Z, such that 
(2) n= II 2: a, 
i=1 j=l 


From (1) and (2) we derive 


tml 


so that i(n;, 2:1) Thus 


n 
j=l 


On the other hand, let i(;, 21) =7;, so that 


bing = Xj, j=1,2,-++,m. 


From (1), this implies 
(3) eI] n= a=)>oa; [I 
j=l 


As (a, =1, 7=1, 2,-+-, m, - - =1 for some If we 
define c; and d;, 7=1, 2, - +--+, m, as solutions of the equations 


t 
Cj Il 2 =1 
tml 
then for 8;=c,a;, 7=1, 2,.--+, it follows that 
rett; 
a;= By 

iss j 

A substitution of this in (3) yields 


II = I (ay pax), 


ful 



t;—1 

’ j=1,2,--++,n, 
j=1,2,--+,m. 




i(n, Za) = i(ns, 
jel 

In case i(n, 21) = ©, it will be possible to find an @ in (2) for which s ex- 
ceeds any given number. Thus it must be possible to find an a for which one 
of the s; exceeds any given number. This implies ¢(;, 21) = © for some value j. 
Thus the theorem is seen to hold in all cases. 

If »€H,, p a prime, and 7 minimally annihilated by p‘, then for any 
acQ, a=p'a, with (a, p) =1, 


i(an, Hy) = i(n, +1 
in case r<#. This immediately leads to the following 
4.4. For n minimally annihilated by and 
a=a,| a, a unit, man, and an+0, 
i(an, = ifn, + 


For a regular element & of Ea, 


jul 


The importance of the concepts of primitivity and index in connection 
with our problem is seen in the next theorem. 


THEOREM 4.5. Let the Q-module $\Sigma_2$ be primitive in the Q-module $\Sigma_1$, and
either $\Sigma_1 \subset H_p$ for some prime p or $\Sigma_1$ be regular. Then, for any
element $\xi$ of $\Sigma_1$ such that $i(\xi, \Sigma_1/\Sigma_2) = 0$ and $\xi$ has the
same minimum annihilator mod $\Sigma_2$ and mod 0, $\Sigma_2 + Q\xi$ is primitive in
$\Sigma_1$.


In the first place, if =,CH,, then & will have some power of p, say p', as 
minimum annihilator. By assumption, =0 mod implies =0, so that 
the sum =.+Qé is supplementary. Let p* be the minimum annihilator 
mod (22+) of an element 7€ Zi, with p*n#0 mod Ze. Thus p'n =aé mod Ze 
for some a€Q, so that 2/22) 2s. Ifa (b, p) =1, then i(aé, =r 
by Corollary 4.4. Thus s Sr, so that p*(n—p*-*bt) =0 mod 2». As 2s is primi- 
tive in 2, there must exist an element & of 2: for which p*t:=0 and 
Be. This last can be written 2.+Qé, which establishes 
the primitivity of 2.+Qé in 

In case 2; is regular, let » be a nonzero element of =, for which 
an=0 mod =2+Qé, a0. This means that there exists an element €Q such 
that an=bE mod Zs. If (a, b) =d=ra+sb, b=dbhi, a =da, then let &=r&+sn. 
We then see that af;=dé mod Zs, or, because of the primitivity of Zin =, 


so that


mod Ze. As =0, a; must be a unit of Q so that and 
n =a; mod Zs. Thus and the theorem is established. 

Now suppose 2; is a Q-module, and Z,/H; has a proper Q-basis, H; being the 
set of all non-regular elements of Z;. Let this basis be (:+H:, +, --- ). 
If =0 mod then a;¢;=0 mod Mi, i=1, 2,---, m, so that &;=0, 
Thus 


Ar = 
and the following theorem is seen to hold in view of Theorem 1.4. 


THEOREM 4.6. If $\Sigma_1$ is a Q-module, then $\Sigma_1$ has a proper Q-basis if and
only if

(1) each nonzero $H_{p_i} \wedge \Sigma_1$ has a proper Q-basis, $i = 1, 2, \cdots$, and

(2) $\Sigma_1/(\Sigma_1 \wedge H)$ has a proper Q-basis in case
$\Sigma_1 \ne \Sigma_1 \wedge H$.

THEOREM 4.7. If $\Sigma_1$ has a proper Q-basis, then all the elements of $\Sigma_1$ have
finite index in $\Sigma_1$.

To establish this, suppose first that = © for an element H/ 
From Theorem 4.3, we see that there must exist an integer r and an element 
a€H,,/\ 2, such that i(a, Hy, \ = ©. Let Hy, AZ: =Qm+Omt+ and 
a=)-%.,a.m;. By assumption, if a has minimum annihilator p!, there exist 
elements and integers such that 


a= pray j=1,2,-+> 
If a; ain, then 
am = plas, #=1,2,-++,m;7 


However, each 7; is annihilated by some power of p,: therefore this last equa- 
tion implies that ~{-'a =0, which contradicts our assumptions. Thus no non- 
regular element of &; can have infinite index. 

Now assume for a regular element of Let a proper 
Q-basis of =; be (m, m2, +--+, &, &, +--+), with the 7;€H and the &; regular. 
Thus pt =p)_7_,a:¢:, p¥0. By assumption, there must exist nonzero elements 
qi and such that =7,9,8;, =1, 2, - - -, with the number of 
prime factors of g; increasing with 7. If 


t=—1 


then 
Thus 




br (as — = O, 1,2,---, =1,2,---, 


so that 
= #=1,2,---,m;7 =1,2,--+: 


If a; has ¢; prime factors, then j can be taken so large that g; has more than #; 
prime factors, i=1, 2, - - - , m. This is impossible, so that no regular element 
of =, can have infinite index. This establishes the theorem. 


THEOREM 4.8. Let $H_1$ be a Q-module with the property that every element of
$H_1$ is annihilated by some power of the prime p. Then $H_1$ has a proper Q-basis if
and only if for every primitive set $H_2 \subset H_1$, $H_2$ having a finite Q-basis, all
the elements of $H_1/H_2$ have finite index.


To prove the necessity of this condition, iet H;=Qm+Qn2+ --- and de- 


fine Ht=Qm+Qn+ -+-+Qm. For any Q-submodule Hz of Hi; which 
possesses a finite Q-basis, there exists an integer m such that H:C Hy. Thus 


= + +--+ + Qom 
by Theorem 3.4. This shows that 
= Qa; + + + + + Onnse + 


and thus all the elements of Hi/H: have finite index by Theorem 4.7. 

To establish the converse, let (m, m2, -- +) be any Q-basis of H;. Assume 
that I is a Q-module primitive in Mi, has a proper Q-basis (a1,a, , 
and t(a;, =0, 7=1, 2, m with T°=0. Also assume 


Qlm, m2 m1] 
If =r>0, with p* the minimum annihilator of 7, mod I, then 
there exists an @m41€H; such that 

Thus ¢(@m4:, Hi/T!™) =0, and by Theorem 4.5, ['"*+!=I'"+Qam4: is primitive 
in H,. If p* is the minimum annihilator of 7, mod I'*+!, then #<t. As above, 
there exists an Qmi2 Hi/T"t!) =0, and ['"*!+Qan42 is primitive 
in H;. There will exist an integer m and elements * » &m+n 


with =Qai+Qaet+ 1(a;, Hi/Ti) =0, j=1, m+n, 
and I'™** primitive in Hy, such that 


Qln, Na" C A. 
Thus, by induction, there exist elements a1, a, - - - GH; such that 
= Qari + 


COROLLARY 4.9. If the Q-module $\Sigma_1$ is annihilated by a nonzero element
$h \in Q$, then $\Sigma_1$ has a proper Q-basis.




If then 


Thus for any Q-module H/ CHi, all the elements of H;/H/ have finite index, 
and the corollary follows from Theorem 4.8. 


THEOREM 4.10. Jf thew has a proper Q-basis if and 
only if, for every regular Q-module K C2, such that K has a finite Q-basis and 
HA is primitive in all elements of 2:/2:/\H+K have finite index. 

The proof of this theorem is similar to that of the last theorem. If 
=:/\H=H,, and = has a proper Q-basis, then 2:=H,+%, Q a regular 
Q-module. Let (£:, &&,--+) be a proper Q-basis for 2. For any regular 
Q-module K which possesses a finite Q-basis there exists an integer m such 


that 
Ai+ KC Ait + +--+ + 


As in Theorem 4.8, we see that 2,/H,+XK has a proper Q-basis, and thus all 
its elements have finite index. 

Conversely, let Z:=Q[(:, Bs, - - - ]V Ai, all B; being regular. Assume that 
we have found regular elements £1, &, +--+, £m in 2: which are Q-linearly 
independent such that, if f*=Q&+Qi&+ --- +Q&+, 


(1) = 0, j=1,2,--- 
and 
(2) Ba, » Beal V Ai CT™C 


Thus is primitive in If #0, then there exists an element 
© 21 such that 


By = mod i(Em41, = 0. 


Thus 
Q[6:, ---, Cr™ CH, 


and the theorem follows by induction. 

It is always desirable to have the important properties of any set carry 
over to “admissible” subsets. In this case, the property of a module having a 
proper basis should carry over to submodules. This was seen to be the case 
for P-modules in Theorem 1.1. That such is also the case for Q-modules is 
demonstrated in the next theorem. 


THEOREM 4.11. If $\Sigma_1$ has a proper Q-basis, then any Q-submodule $\Sigma_2$ of
$\Sigma_1$ also has a proper Q-basis.




This is a consequence of Theorems 4.8 and 4.10. Let H;=H,,/ &;, 7 =1, 2, 
and assume H;~0. Then for any primitive set H;CH:2, Hs having a finite 
Q-basis, and any element Ag, H2/Hs) S1(n, Ai/ Hs). Now, even though 
H; need not be primitive in H;, i(n, Hi/H:)< ©. Thus H2/H;)< © so 
that H2 has a proper Q-basis. A similar argument also shows that 22/H/\ 2 
has a proper Q-basis, so that =, has a proper Q-basis by Theorem 4.6. 

5. The modular case. The statement that Q is an operator domain of $\Sigma$
carries with it the assumption that $a\xi = b\xi$ for all $\xi \in \Sigma$ implies
$a = b$. We will now consider the case in which $a\xi = b\xi$ for all $\xi \in \Sigma$
implies $a \equiv b \mod h$, h not a unit. This is equivalent to the statement that
Q/(h) is an operator domain of $\Sigma$. Let R = Q/(h) and
$h = \prod_{i=1}^{k} p_i^{n_i}$, $p_i$ primes of Q and $n_i \ge 1$.
As above, $P \subset R$, and the Fundamental Axiom will still be assumed.

If h is a prime, then R is a field and $\Sigma$ is a regular R-module. Otherwise,
it is apparent that the minimum annihilators of the non-regular elements of
$\Sigma$ are divisors of h. In this case, if $\xi$ is regular, $p_1\xi$ is non-regular,
so that in view of Theorem 1.4 we have

\[ (1) \qquad \Sigma = H_{p_1} + H_{p_2} + \cdots + H_{p_k}. \]

THEOREM 5.1. The set $\Sigma$ has a proper R-basis.


If R is a field, this follows from Theorem 1.1. Otherwise, the proof will be 
to show that H,, has a proper R-basis, which will lead to a proper R-basis 
for = in view of (1). For simplification, let H,,=H, =n, pi=p, and de- 
fine H; to be the maximum submodule of H annihilated by p‘, OSisn, H)=0. 
Thus CH,=H. 

The R-module Hy,/Hm1, 0<m S12, is annihilated by p. Let (£1, &, - - - 
be a P-basis for this R-module. Now discard all £;=0 mod R[é:, &, - - + , &-1] 
V Has. If we label the remaining set (£1, &, ---), then this set is a proper 
R-basis for H,,/Hm-1 and is R-linearly independent in H,,. For, if 


k 
akg = 0, a, 4 Omod 


then a.¢,=0 mod &, &-1]\VHm—1. However, as a, has an inverse 
mod this implies that mod R[&, &, ---, &-1]\VHm-1, which is im- 
possible in view of the method of selection of &. 

Assume that H/Hn4: has a proper R-basis (a1, ae, - + - ) for some integer 
m, 0Sm<n-—1, this basis being a R-linearly independent set in H. Let 
(81, Be, -- +) be a proper R-basis for Hn4:/Hm. Now discard all 8; such that 


B; = 0 mod ] V An V B;-1]. 
Denote those remaining by (71, Y2,-°-:). From the way that the set 


(*) The basis is in reality $(\xi_1 + H_{m-1}, \xi_2 + H_{m-1}, \cdots)$; see the last paragraph of §1.




(01, G2, ***, Yt» Y2, ** * ) was chosen, it can be seen that it is a proper R-basis 
for H/H,. To show that this set is R-linearly independent in H, suppose 


r k 

where c is not congruent to zero mod p. Then c, has an inverse mod #, so 
that 


0 mod R[e, ] V An V Tec? 


This contradicts the method of selection of y,. Thus, by induction, H=H/Ho 
has a proper R-basis. 

6. Applications to infinite matrices. The ordered set $(1, 2, \cdots, n, \cdots)$ of
type $\omega$ will be denoted by $\Lambda$. In what is to follow, x will denote a
commutative indeterminate over P, and P[x] will denote the polynomial domain in x
over P. The P-module to be used as the $\Sigma$ above is defined as follows:


DEFINITION 6.1. $\Sigma$ is the P-module composed of all vectors
$(a_i;\ i \in \Lambda)$ with elements in P such that only a finite number of the elements
in each vector are different from zero. Addition is ordinary vector addition.


The vector $(a_i;\ i \in \Lambda)$ with $a_j = 1$, $a_i = 0$ for $i \ne j$ is denoted by
$\delta_j$. The set of vectors $(\delta_1, \delta_2, \cdots)$ is a proper P-basis for
$\Sigma$, and thus the Fundamental Axiom is satisfied.

The total matric algebra of order $n^2$ over P is denoted by $\mathfrak{M}_n$. If A is an
element of $\mathfrak{M}_n$, and $\eta$ is an element of the total vector space
$\Sigma_n$ of order n over P, then $A\eta$ (considering $\eta$ as an $n \times 1$ matrix)
is again an element of $\Sigma_n$. Thus A is an operator of $\Sigma_n$, and
$\mathfrak{M}_n$ is the total operator domain of $\Sigma_n$.

A total operator domain exists for $\Sigma$, and is equivalent to the ring
$\mathfrak{M}_\omega$ below.


DEFINITION 6.2. $\mathfrak{M}_\omega$ is the set of all matrices
$(a_{rs};\ r, s \in \Lambda)$ over P with the property that the vectors
$(a_{rs};\ r \in \Lambda)$ are in $\Sigma$ for all $s \in \Lambda$.


For $A, B \in \mathfrak{M}_\omega$, $A = (a_{rs};\ r, s \in \Lambda)$,
$B = (b_{rs};\ r, s \in \Lambda)$, the sum and product of these are defined as usual;
that is,

\[ AB = \Big(\sum_t a_{rt}b_{ts};\ r, s \in \Lambda\Big), \qquad
   A + B = (a_{rs} + b_{rs};\ r, s \in \Lambda). \]

Under these operations of (finite) sum and product, $\mathfrak{M}_\omega$ is a ring. The
element $I = (a_{rs};\ r, s \in \Lambda)$, with $a_{rs} = 0$ for $r \ne s$ and
$a_{rr} = 1$, is the unit element of $\mathfrak{M}_\omega$. The notation above will be
simplified by omitting the range of the indices when there is no chance of ambiguity.
The notation $\sum_t$ means that the summation is taken over $\Lambda$.

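The matrices of $\mathfrak{M}_\omega$ are left operators of $\Sigma$ under the following
definition: for $A \in \mathfrak{M}_\omega$, $\eta \in \Sigma$, with $A = (a_{rs})$,
$\eta = (c_i)$,

\[ A\eta = \Big(\sum_j a_{ij}c_j;\ i \in \Lambda\Big). \]

One can think of $\mathfrak{M}_\omega$ also as a vector space with elements from
$\Sigma$. If $A = (a_{rs})$, then $A = (\alpha_j;\ j \in \Lambda)$, where
$\alpha_j = (a_{ij};\ i \in \Lambda)$.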

An element A of $\mathfrak{M}_\omega$ is regular in case A possesses an inverse in
$\mathfrak{M}_\omega$. The ring $\mathfrak{M}_\omega$ possesses elements which are
semi-regular; that is, elements which have a right inverse but not a left, or vice
versa. Such an element is N defined below (Definition 6.10). The element A is algebraic
in case there exists a nonzero $m(x) \in P[x]$ such that $m(A) = 0$. Thus, if A is
algebraic, there exists a polynomial h(x) of minimal degree, called the minimal
polynomial of A, such that $h(A) = 0$.
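The semi-regular behaviour can be made concrete with a short sketch. It rests on assumptions that are not confirmed by the text here: N is read as the matrix whose columns are $(\delta_2, \delta_3, \cdots)$, finitely nonzero vectors are stored as dictionaries from index to entry, and a column-finite matrix is represented by a function returning each column. The names `apply`, `shift` and `unshift` are hypothetical.

```python
def apply(column, vector):
    """Apply a column-finite matrix to a finitely nonzero vector.

    `column(s)` returns column s of the matrix as a dict {row: entry};
    `vector` is a dict {index: entry}. Indices start at 1.
    """
    result = {}
    for s, c in vector.items():
        for r, a in column(s).items():
            result[r] = result.get(r, 0) + a * c
    # Drop zero entries so that equal vectors compare equal.
    return {r: v for r, v in result.items() if v != 0}


def shift(s):
    """Column s of the assumed matrix N = (delta_2, delta_3, ...)."""
    return {s + 1: 1}


def unshift(s):
    """Column s of a candidate left inverse: delta_1 -> 0, delta_{s+1} -> delta_s."""
    return {s - 1: 1} if s > 1 else {}


eta = {1: 3, 4: -2}
left = apply(unshift, apply(shift, eta))   # (unshift . N) eta
right = apply(shift, apply(unshift, eta))  # (N . unshift) eta
```

Under these assumptions `left` recovers `eta`, while `right` loses the first coordinate. Since every vector in the image of the shift has first coordinate zero, $\delta_1$ is never reached, which is why no right inverse can exist.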

For any non-algebraic element A of $\mathfrak{M}_\omega$, the polynomial domain P[A] is
a principal ideal ring. In what is to follow, these principal ideal rings correspond to
the ring Q used above. In case A is algebraic with minimal polynomial h(x), then P[A]
is isomorphic to the ring P[x]/(h(x)), and the theory of §5 is applicable.


THEOREM 6.3. If $(\xi_1, \xi_2, \cdots)$ is a proper P-basis for $\Sigma$, then
$C = (\xi_i;\ i \in \Lambda)$ is a regular element of $\mathfrak{M}_\omega$.


To prove this, we see that there must exist $a_{ij} \in P$ such that
$\delta_j = \sum_i a_{ij}\xi_i$. Let $A = (a_{rs})$; then $CA = I$, so that A is a right
inverse of C. Now $CAC - C = 0$, so $C(AC - I) = 0$. If $(AC - I) = (b_{rs}) \ne 0$, there
exists an integer n such that $\eta = (b_{rn};\ r \in \Lambda) \ne 0$. Then
$C\eta = \sum_r b_{rn}\xi_r = 0$, which contradicts the hypothesis that
$(\xi_1, \xi_2, \cdots)$ is a proper P-basis for $\Sigma$. Thus $AC = I$, and C is
regular.


DEFINITION 6.4. An element A of $\mathfrak{M}_\omega$ is said to be reducible if and only
if $\Sigma$ has a proper P[A]-basis.


The definition of direct sums of finite matrices (see [4, p. 237]) can be
carried over to $\mathfrak{M}_\omega$. Thus an element A of $\mathfrak{M}_\omega$ is the
direct sum (+) of the elements $A_1$ of $\mathfrak{M}_n$ and $A_2$ of
$\mathfrak{M}_\omega$, $A = A_1 + A_2$, if and only if, for $A = (a_{rs})$,
$A_1 = (a_{1rs})$, $A_2 = (a_{2rs})$, $a_{rs} = a_{1rs}$ for $r, s \le n$,
$a_{n+r,\,n+s} = a_{2rs}$ for $r, s \in \Lambda$, and otherwise $a_{rs} = 0$. By
iteration, the direct sum of an infinite number of finite matrices can be defined.
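A finite truncation of this direct sum is easy to write down. The sketch below is illustrative only: it assumes all summands are finite square matrices stored as lists of rows, whereas the text allows the second summand to be infinite. The function name `direct_sum` is hypothetical.

```python
def direct_sum(*blocks):
    """Return the block-diagonal matrix with the given square blocks
    placed along the diagonal and zeros elsewhere."""
    size = sum(len(block) for block in blocks)
    result = [[0] * size for _ in range(size)]
    offset = 0
    for block in blocks:
        for r, row in enumerate(block):
            for s, entry in enumerate(row):
                # Entry (r, s) of the block lands at (offset + r, offset + s).
                result[offset + r][offset + s] = entry
        offset += len(block)
    return result


A1 = [[1, 2], [3, 4]]
A2 = [[5]]
A = direct_sum(A1, A2)
```

Entries of the first block keep their positions, entries of the second are shifted by the order of the first, and every other entry is zero, matching the three cases of the definition.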


DEFINITION 6.5. An element A of $\mathfrak{M}_\omega$ is in reduced form in case A is the
direct sum of finite matrices.


As in the finite case, two elements A and B of $\mathfrak{M}_\omega$ are similar in case
there exists a regular element T of $\mathfrak{M}_\omega$ such that $B = T^{-1}AT$.


THEOREM 6.6. If A is reducible, then any element B similar to A is also
reducible.


Let $B=T^{-1}AT$ and $\Sigma=P[A]\xi_1+P[A]\xi_2+\cdots$. Now define $\eta_n=T^{-1}\xi_n$,
$n=1, 2, \cdots$. Then $\Sigma=P[B]\eta_1+P[B]\eta_2+\cdots$, so that $B$ is reducible.

THEOREM 6.7. If $A$ is in reduced form, then $A$ is reducible.


That this is true is a consequence of the fact that $\Sigma_n$ has a proper
$P[B]$-basis for any $B\in\mathfrak{M}_n$.

THEOREM 6.8. If $A$ is reducible and all the elements of $\Sigma$ are non-regular
with respect to $P[A]$, then $A$ is similar to an element $B$ in reduced form.

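Let $\Sigma=P[A]\xi_1+P[A]\xi_2+\cdots$, with $h_i(A)$ the minimum annihilator of
$\xi_i$. If $t_i$ is the degree of $h_i(x)$, then

$$h_i(x)=x^{t_i}+\sum_{j=1}^{t_i}a_{i,j-1}x^{j-1}.$$

Now define

$$T=(\xi_1, A\xi_1, \cdots, A^{t_1-1}\xi_1, \xi_2, A\xi_2, \cdots, A^{t_2-1}\xi_2, \cdots),$$

so that $T\in\mathfrak{M}_\omega$. By Theorem 6.3, $T$ is regular in $\mathfrak{M}_\omega$. If $A_i$ is the companion
matrix(5) of $h_i(x)$, so that $A_i\in\mathfrak{M}_{t_i}$, then

$$T^{-1}AT=A_1\dotplus A_2\dotplus\cdots,$$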


and the theorem is established. 
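The companion matrices appearing in this reduction can be checked on small cases. The following is a minimal sketch, assuming the convention of footnote 5 (1's directly below the main diagonal, the negated coefficients down the last column); the function names `companion`, `matmul` and `evaluate` are illustrative and not taken from the paper, and the rationals stand in for the field $P$.

```python
from fractions import Fraction


def companion(coeffs):
    """Companion matrix of x^t + a_{t-1} x^{t-1} + ... + a_0.

    coeffs is the list [a_0, ..., a_{t-1}].  The matrix has 1's directly
    below the main diagonal, -a_0, ..., -a_{t-1} down the last column,
    and 0's elsewhere.
    """
    t = len(coeffs)
    C = [[Fraction(0)] * t for _ in range(t)]
    for r in range(1, t):
        C[r][r - 1] = Fraction(1)
    for r in range(t):
        C[r][t - 1] = -Fraction(coeffs[r])
    return C


def matmul(X, Y):
    """Product of two square matrices of the same size."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def evaluate(coeffs, C):
    """Evaluate the monic polynomial with coefficients coeffs at the matrix C."""
    t = len(C)
    identity = [[Fraction(int(i == j)) for j in range(t)] for i in range(t)]
    result = identity  # leading coefficient 1
    for a in reversed(coeffs):  # Horner's rule: result <- result*C + a*I
        result = matmul(result, C)
        result = [[result[i][j] + a * identity[i][j] for j in range(t)]
                  for i in range(t)]
    return result


# h(x) = x^3 - 1 has coefficient list [-1, 0, 0].
C = companion([-1, 0, 0])
print(evaluate([-1, 0, 0], C) == [[0, 0, 0], [0, 0, 0], [0, 0, 0]])
```

The check `evaluate(coeffs, companion(coeffs))` returning the zero matrix reflects the fact that each $h_i(x)$ annihilates its own companion matrix.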
This theorem does not include the important case of algebraic matrices. 
However, in view of Theorem 5.1, a similar proof leads to 


THEOREM 6.9. If $A$ is algebraic with minimal polynomial $m(x)$, then $A$ is
similar to an element $B$ in reduced form. If $B=B_1\dotplus B_2\dotplus\cdots$, then each $B_i$ is
the companion matrix of some divisor of $m(x)$.


DEFINITION 6.10. The elements $N$ and $N^\omega$ of $\mathfrak{M}_\omega$ are defined as follows:
N = (52, N° =0, 
N* = N%53, N*54, N%55, N%56,--- ). 
That powers of N are of fundamental importance in the study of reducible 
matrices is seen in the following 
THEOREM 6.11. If $A$ is reducible, $A$ not algebraic, and $\Sigma$ is a regular
$P[A]$-module, then $A$ is similar to some power of $N$.
To prove this, let $\Sigma=P[A]\xi_1+P[A]\xi_2+\cdots$, the number of basis elements $\xi_i$ being $n$.
If $n<\omega$, let


(5) This is the $t_i\times t_i$ matrix with 1's directly below the main diagonal, $-a_{i0}, -a_{i1}, \cdots, -a_{i,t_i-1}$
as the last column, and 0's elsewhere.




$$T=(\xi_1, \xi_2, \cdots, \xi_n, A\xi_1, A\xi_2, \cdots).$$
Then $T^{-1}AT=N^n$. However, if $n=\omega$, let
T = (&1, A*E1, Ake, &3, ). 


Then $T^{-1}AT=N^\omega$, and the theorem follows.

The author intends to deal at greater length in a subsequent paper with 
matrix algebras of different order types. However, a matrix algebra of order 
type w2 will be considered briefly here. 

Denote the ordered set $(1, 2, \cdots, \omega+1, \omega+2, \cdots)$ of type $\omega 2$ by $\Lambda_2$.
Let $\Sigma_2$ be the $P$-module composed of all vectors $(a_i;\ i\in\Lambda_2)$ over $P$, with only
a finite number of the elements of any vector being different from zero. $\mathfrak{M}_{\omega 2}$
is the ring of all matrices $(a_{rs};\ r, s\in\Lambda_2)$ over $P$ with all $(a_{rs};\ r\in\Lambda_2)$ in $\Sigma_2$.
Let $\delta_i'$ be the element of $\Sigma_2$ which has 1 in the $i$th place and 0 elsewhere,
$i\in\Lambda_2$. Then $(\delta_i';\ i\in\Lambda_2)$ is a proper $P$-basis of $\Sigma_2$.

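For $A\in\mathfrak{M}_\omega$, $A'\in\mathfrak{M}_{\omega 2}$, the correspondence

$$A\leftrightarrow A'=\begin{pmatrix} A & 0\\ 0 & 0\end{pmatrix}$$

defines an isomorphism between $\mathfrak{M}_\omega$ and a subring of $\mathfrak{M}_{\omega 2}$. (Under a different
correspondence, $\mathfrak{M}_\omega$ and $\mathfrak{M}_{\omega 2}$ can be shown to be actually isomorphic.) Using
the notation of direct sum, $A'=A\dotplus 0$.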

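If $A$ is reducible so that $\Sigma=P[A]\xi_1+P[A]\xi_2+\cdots$, then define
$\xi_i'$ to be the vector of $\Sigma_2$ whose first $\omega$ components are those of $\xi_i$ and whose remaining components are all zero. Then

$$\Sigma_2=P[A']\xi_1'+P[A']\xi_2'+\cdots+P[A']\delta_{\omega+1}'+P[A']\delta_{\omega+2}'+\cdots,$$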


so that A’ is also reducible. 


THEOREM 6.12. If $A$ is reducible, then $A'$ is similar to $B\dotplus N^k$ where $B\in\mathfrak{M}_\omega$
is in reduced form and $k$ is an integer or $\omega$.


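To prove this, let

$$\Sigma_2=P[A']\eta_1'+P[A']\eta_2'+\cdots+P[A']\xi_1'+\cdots+P[A']\xi_k',$$

the $\xi_i'$ being regular and the $\eta_i'$ non-regular, with $h_i(x)$ the minimum annihilator
of $\eta_i'$. The degree of $h_i(x)$ is labeled $m_i$. Then $T'\in\mathfrak{M}_{\omega 2}$ can be chosen
(assuming $k$ finite) as

$$T'=(\eta_1', A'\eta_1', \cdots, A'^{m_1-1}\eta_1', \eta_2', A'\eta_2', \cdots;\ \xi_1', \cdots, \xi_k', A'\xi_1', \cdots, A'\xi_k', \cdots).$$

Theorem 6.3 is seen to carry over for $\Sigma_2$, and thus

$$T'^{-1}A'T'=B\dotplus N^k,$$

$B$ reduced. If $k=\omega$, the $\xi_i'$ in $T'$ can be arranged as in Theorem 6.11.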




From Theorem 2.2, we see that every reducible matrix $A$ of $\mathfrak{M}_\omega$ has an
associated characteristic matrix. The characteristic matrix of a reducible 
matrix is of importance in determining the similarity of matrices, as the fol- 
lowing theorem shows. 


THEOREM 6.13. Two reducible matrices $A$ and $B$ of $\mathfrak{M}_\omega$ are similar if and
only if their characteristic matrices are equal. 


To establish this, first assume that $\Sigma=P[A]\xi_1+P[A]\xi_2+\cdots$ and
$B=T^{-1}AT$. Then, by Theorem 6.6, $\Sigma=P[B]\eta_1+P[B]\eta_2+\cdots$, where
$\eta_i=T^{-1}\xi_i$. For any $m(x)\in P[x]$, $m(A)\xi_i=Tm(B)\eta_i$, so $\eta_i$ is regular with respect
to $P[B]$ if $\xi_i$ is regular with respect to $P[A]$. Also, if $m(x)$ is the minimum
annihilator of $\xi_i$ with respect to $P[A]$, $m(x)$ is the minimum annihilator
of $\eta_i$ with respect to $P[B]$. Thus the characteristic matrix of $A$ equals the
characteristic matrix of $B$.

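Now assume $A$ and $B$ have the same characteristic matrix. For any
proper $P[A]$-basis $(\xi_1, \xi_2, \cdots, \eta_1, \eta_2, \cdots)$ of $\Sigma$ with $\xi_i$ regular and $\eta_i$ minimally
annihilated by $m_i(x)$, there exists a proper $P[B]$-basis
$(\bar\xi_1, \bar\xi_2, \cdots, \bar\eta_1, \bar\eta_2, \cdots)$ of $\Sigma$ with $\bar\xi_i$ regular and $\bar\eta_i$ minimally
annihilated by $m_i(x)$. There is a 1-1 correspondence $\xi_i\leftrightarrow\bar\xi_i$, $\eta_i\leftrightarrow\bar\eta_i$ between these
two bases. Then let $T$ be the matrix whose columns are the vectors $A^j\eta_i$ ($j$ less than the degree of $m_i(x)$) and $A^j\xi_i$ ($j=0, 1, \cdots$), taken in a fixed order,
and let $S$ be the same as $T$ with $A$ replaced throughout by $B$ and each basis vector by its correspondent. Then
$T^{-1}AT=S^{-1}BS$, so $A$ and $B$ are similar, and the theorem is established.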


THEOREM 6.14. If A is reducible and regular, then A is similar to a matrix 
in reduced form. 


If $\Sigma$ possesses a regular vector with respect to $P[A]$, then any proper
$P[A]$-basis of $\Sigma$ must have a regular element $\xi$. Select $T$ as in the proof of
Theorem 6.13 with $\xi$ the first vector of $T$. Then $T^{-1}AT$ has its first row (or
column) composed of zeros. This implies $T^{-1}AT$ is not regular, which means $A$
is not regular. Thus $\Sigma$ can have only non-regular elements with respect to
$P[A]$, and the theorem follows from Theorem 6.8.

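As an example of the reduction of an element of $\mathfrak{M}_\omega$ to reduced form, take

$$A=(\delta_1+\delta_3,\ -\delta_3,\ \delta_2-\delta_3,\ -\delta_3+\delta_4,\ -\delta_3+\delta_5,\ -\delta_3+\delta_6,\ \cdots).$$

It can be verified that $A$ is algebraic with minimal equation $x^3-1=0$. Then
$\Sigma=P[A]\delta_1+P[A](\delta_1+\delta_4)+P[A](\delta_1+\delta_5)+\cdots$, with $A^3-1$ the minimum
annihilator of $\delta_1$ and $A-1$ the minimum annihilator of $\delta_1+\delta_i$, $i=4, 5, \cdots$.
Then $T$ can be chosen as

$$T=(\delta_1,\ \delta_1+\delta_3,\ \delta_1+\delta_2,\ \delta_1+\delta_4,\ \delta_1+\delta_5,\ \cdots),$$
$$T^{-1}=(\delta_1,\ -\delta_1+\delta_3,\ -\delta_1+\delta_2,\ -\delta_1+\delta_4,\ -\delta_1+\delta_5,\ \cdots),$$

so that

$$T^{-1}AT=(\delta_2,\ \delta_3,\ \delta_1,\ \delta_4,\ \delta_5,\ \delta_6,\ \cdots).$$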
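The example can be checked numerically on a finite truncation. The sketch below assumes that matrices are written by their columns, as in the example, and uses the fact that for $A$, $T$ and $T^{-1}$ the $j$th column has no nonzero entry beyond row $\max(j, 3)$, so the leading $6\times 6$ blocks multiply among themselves. The size `N = 6` and all helper names are illustrative choices, not part of the paper.

```python
from fractions import Fraction

N = 6  # truncation size; the leading N x N blocks are closed under products here


def unit(i):
    """Column vector delta_i (1-based index) of length N."""
    return [Fraction(int(r == i - 1)) for r in range(N)]


def comb(*terms):
    """Linear combination of unit vectors, given as (coefficient, index) pairs."""
    v = [Fraction(0)] * N
    for c, i in terms:
        v[i - 1] += c
    return v


def from_columns(cols):
    """Matrix (list of rows) whose j-th column is cols[j]."""
    return [[cols[j][r] for j in range(N)] for r in range(N)]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


A = from_columns([comb((1, 1), (1, 3)), comb((-1, 3)), comb((1, 2), (-1, 3))]
                 + [comb((-1, 3), (1, i)) for i in range(4, N + 1)])
T = from_columns([comb((1, 1)), comb((1, 1), (1, 3)), comb((1, 1), (1, 2))]
                 + [comb((1, 1), (1, i)) for i in range(4, N + 1)])
T_inv = from_columns([comb((1, 1)), comb((-1, 1), (1, 3)), comb((-1, 1), (1, 2))]
                     + [comb((-1, 1), (1, i)) for i in range(4, N + 1)])

identity = from_columns([unit(i) for i in range(1, N + 1)])
reduced = from_columns([unit(2), unit(3), unit(1)]
                       + [unit(i) for i in range(4, N + 1)])

B = matmul(matmul(T_inv, A), T)
print(B == reduced)                      # T^{-1} A T against the reduced form
print(matmul(A, matmul(A, A)) == identity)  # A^3 against the identity
```

The first three columns of the reduced form are the companion matrix of $x^3-1$; the remaining diagonal 1's are the companion matrices of $x-1$.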


BIBLIOGRAPHY 


1. C. Chevalley, L'arithmétique dans les algèbres de matrices, Paris, 1936.

2. M. H. Ingraham, A general theory of linear sets, Trans. Amer. Math. Soc. vol. 27 (1925) 
pp. 163-196. 

3. M. H. Ingraham and M. C. Wolf, Relative linear sets and similarity of matrices whose elements
belong to a division algebra, ibid. vol. 42 (1937) pp. 16-31.

4. C. C. MacDuffee, Introduction to abstract algebra, New York, 1940. 

5. H. Zassenhaus, Lehrbuch der Gruppentheorie, Leipzig, 1937. 


UNIVERSITY OF WISCONSIN, 
MADISON, WIS.




FOUNDATIONS OF A GENERAL THEORY OF 
BIRATIONAL CORRESPONDENCES 


BY 
OSCAR ZARISKI 


In our papers dealing with the reduction of singularities of an algebraic 
surface (see [8, 11]), we were forced to devote a good deal of space to cer- 
tain properties of birational correspondences for which we could find no gen- 
eral proofs in the literature. These properties were of a general character and 
therefore could not be regarded as part of the reduction proof proper, al- 
though they did play an auxiliary role in the proof. A similar situation arose 
in our reduction proof for three-dimensional varieties (not yet published), 
but in this case the amount of preliminary general material on birational 
correspondences used in the proof was even larger and was out of proportion 
to the length of the reduction proof proper. It thus became increasingly clear 
that the procedure of treating general questions of birational correspondences 
only as and when these questions come up in connection with various steps 
of the reduction process, could no longer be followed in the case of higher 
varieties. Instead it seemed necessary—and also worthwhile for its own sake— 
to develop systematically in a separate paper the fundamental concepts and 
theorems of the theory of birational correspondences, and to do this in as 
general a fashion as possible. This we propose to do in the present paper. We 
deal here with algebraic varieties, with or without singularities, over an arbi- 
trary ground field (of characteristic zero or p). 

It is difficult to say which of our results are entirely novel and which are 
not. Since many of the results hold only for normal varieties, they would 
appear to be novel inasmuch as our concept of a normal variety is new. On 
the other hand, most of our results were known for nonsingular models. It is 
perhaps correct to say that the novelty of the present investigation consists 
in showing that most of the known properties of birational correspondences 
between nonsingular varieties remain true more generally for normal varie- 
ties. 

Of importance for the theory of algebraic functions over arbitrary ground 
fields of characteristic p is the fact that our construction of normal varieties 
which we gave in [7] in the case of algebraically closed ground fields of char- 
acteristic zero—and which carries over without essential modifications to 
arbitrary fields of characteristic zero (see II. 2)—can be extended to arbi- 
trary fields of characteristic p (II. 2). This extension is made possible by a 
theorem of F. K. Schmidt [5] and by the normalization theorem of Emmy 


Presented in part to the Society, December 31, 1941 under the title Normal varieties and 
birational correspondences (see [12]); received by the editors September 1, 1942. 




Noether. The proof of this last theorem for arbitrary ground fields (and not 
only for fields containing “sufficiently many” elements; see Krull [2, pp. 41- 
42]) is given in II.2. 

A feature of the treatment is our use of valuation theory. Our very defini- 
tion of a birational correspondence (II.1) is valuation-theoretic in character, 
and our proofs are naturally conditioned by this valuation-theoretic ap- 
proach. The characterization of an integrally closed ring as intersection of 
valuation rings, and the ideal theory in such a ring, also play an important 
role in our treatment. 

Thanks are due to Irvin Cohen for valuable assistance lent by him during 
the preparation of this paper. 

The following list of contents will give an idea of the individual topics 
treated. 

CONTENTS

Part I. VALUATION-THEORETIC PRELIMINARIES

1. Homogeneous ideals
2. Homogeneous and nonhomogeneous coordinates
3. The center of a valuation
4. Existence theorems for valuations with a preassigned center

Part II. GENERAL THEORY OF BIRATIONAL CORRESPONDENCES

1. Valuation-theoretic definition of a birational correspondence
2. The birational correspondence between $V$ and a derived normal model $\overline{V}$
3. The fundamental elements of a birational correspondence
4. A question of terminology
5. The join of two birationally equivalent varieties
6. Further properties of fundamental varieties
7. The main theorem
8. Continuation of the proof of the main theorem
9. The fundamental locus of a birational correspondence


Part I. VALUATION-THEORETIC PRELIMINARIES


1. Homogeneous ideals. Let $\eta_0, \eta_1, \cdots, \eta_n$ be the homogeneous coordinates
of the general point of an irreducible $r$-dimensional algebraic variety $V$
immersed in an $n$-dimensional projective space. The coordinates $\eta_i$ are defined
to within a linear homogeneous nonsingular transformation with coefficients in
the ground field $K$. By the very definition of homogeneous coordinates (Zariski
[7, p. 284]), if $f(\eta)=0$ is an algebraic relation between the $\eta$'s over $K$, and if
we write $f$ as a sum of forms $f_i(\eta)$ of different degrees, then each $f_i(\eta)$
individually is zero. This is equivalent to saying that the polynomials $f(y)$ in
the polynomial ring $K[y_0, y_1, \cdots, y_n]$ (the $y$'s are indeterminates) such that
$f(\eta)=0$, form a homogeneous ideal, that is, an ideal which has a base consisting
of forms.



Let $P$ denote the ring $K[\eta_0, \eta_1, \cdots, \eta_n]$. We shall also consider homogeneous
ideals in $P$, that is, again ideals in $P$ which have a base consisting of
forms. These are the ideals on which the homogeneous ideals of $K[y_0, \cdots, y_n]$
are mapped in the homomorphism $K[y]\sim K[\eta]$. The ring $P$ possesses relative
automorphisms $\tau_\lambda$ over $K$, where for any element $\phi(\eta)$ in $P$ we define:
$\tau_\lambda(\phi(\eta))=\phi(\lambda\eta)$, $\lambda\in K$, $\lambda\neq 0$. It is clear that if $\mathfrak{A}$ is a homogeneous ideal in $P$,
then $\tau_\lambda(\mathfrak{A})=\mathfrak{A}$. Conversely, we have the following theorem:


THEOREM 1. If $K$ has infinitely many elements and if an ideal $\mathfrak{A}$ in $P$ is
such that $\tau_\lambda(\mathfrak{A})\subseteq\mathfrak{A}$, for all $\lambda$ in $K$ ($\lambda\neq 0$), then $\mathfrak{A}$ is homogeneous.


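Proof. Let $\omega=f(\eta)$ be an arbitrary element of $\mathfrak{A}$ and let $f(\eta)=f_s(\eta)+f_{s+1}(\eta)+\cdots+f_m(\eta)$,
where $f_i$ is a form of degree $i$. We take arbitrarily in $K$ a
set of $m-s+1$ distinct elements $\lambda_1, \lambda_2, \cdots, \lambda_{m-s+1}$, different from zero. We
have, by hypothesis:

$$\lambda_i^s f_s(\eta)+\lambda_i^{s+1}f_{s+1}(\eta)+\cdots+\lambda_i^m f_m(\eta)\equiv 0\ (\mathfrak{A}),\qquad i=1, 2, \cdots, m-s+1.$$

Since the $\lambda_i$ are distinct and different from zero, the determinant of the coefficients $\lambda_i^j$ is different from zero. Hence these $m-s+1$ congruences imply the congruences: $f_j(\eta)\equiv 0\ (\mathfrak{A})$, $j=s,
s+1, \cdots, m$. Hence $\mathfrak{A}$ is a homogeneous ideal, as was asserted.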
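The elimination step in this proof can be made concrete. The following is a minimal sketch, assuming polynomials are stored as dictionaries from exponent tuples to coefficients and that the rationals stand in for an infinite field $K$; the names `scale`, `add_scaled` and `homogeneous_components` are illustrative only. It recovers each form $f_j$ as a $K$-linear combination of the transforms $\tau_{\lambda_i}(f)$ by solving the Vandermonde-type system.

```python
from fractions import Fraction


def scale(f, lam):
    """Apply tau_lambda: replace every variable eta_i by lam * eta_i."""
    return {mono: c * lam ** sum(mono) for mono, c in f.items()}


def add_scaled(f, g, c):
    """Return f + c*g, dropping zero coefficients."""
    out = dict(f)
    for mono, coeff in g.items():
        val = out.get(mono, 0) + c * coeff
        if val == 0:
            out.pop(mono, None)
        else:
            out[mono] = val
    return out


def homogeneous_components(f, lambdas):
    """Recover the forms f_s, ..., f_m of a nonzero f from tau_{lambda_i}(f)."""
    degrees = sorted({sum(mono) for mono in f})
    s, m = degrees[0], degrees[-1]
    k = m - s + 1
    assert len(set(lambdas)) == k and 0 not in lambdas
    # Row i encodes tau_{lambda_i}(f) = sum_j lambda_i^j f_j, j = s..m.
    matrix = [[Fraction(lam) ** j for j in range(s, m + 1)] for lam in lambdas]
    rhs = [scale(f, Fraction(lam)) for lam in lambdas]
    # Gauss-Jordan elimination with polynomial right-hand sides.
    for col in range(k):
        pivot = next(r for r in range(col, k) if matrix[r][col] != 0)
        matrix[col], matrix[pivot] = matrix[pivot], matrix[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        inv = 1 / matrix[col][col]
        matrix[col] = [x * inv for x in matrix[col]]
        rhs[col] = add_scaled({}, rhs[col], inv)
        for r in range(k):
            if r != col and matrix[r][col] != 0:
                factor = matrix[r][col]
                matrix[r] = [x - factor * y
                             for x, y in zip(matrix[r], matrix[col])]
                rhs[r] = add_scaled(rhs[r], rhs[col], -factor)
    return {s + j: rhs[j] for j in range(k)}


# f = eta_0^2 + 3*eta_0*eta_1*eta_2 + 5*eta_1 - 7, in three variables.
f = {(2, 0, 0): 1, (1, 1, 1): 3, (0, 1, 0): 5, (0, 0, 0): -7}
parts = homogeneous_components(f, [1, 2, 3, 4])
print(parts[2] == {(2, 0, 0): 1})
```

Because each recovered component is a linear combination of elements $\tau_{\lambda_i}(f)$, all of which lie in the ideal by hypothesis, the component lies in the ideal as well; this is the content of the congruences in the proof.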


THEOREM 2. If $\mathfrak{A}=[\mathfrak{q}_1, \mathfrak{q}_2, \cdots, \mathfrak{q}_h]$ is a normal decomposition of a homogeneous
ideal $\mathfrak{A}$ into maximal primary components, then each $\mathfrak{q}_i$ is either itself
homogeneous or (in the case of an embedded component) can be so selected as
to be homogeneous.


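Proof. We first consider the case of an infinite ground field $K$. Let $\mathfrak{p}_i$
be the prime ideal associated with the primary ideal $\mathfrak{q}_i$. The infinitely many
automorphisms $\tau_\lambda$ leave $\mathfrak{A}$ invariant and hence must permute the prime ideals
$\mathfrak{p}_1, \mathfrak{p}_2, \cdots, \mathfrak{p}_h$. Consequently each of these prime ideals is left invariant by
infinitely many automorphisms $\tau_\lambda$. It follows then from the proof of the preceding
theorem, that the prime ideals $\mathfrak{p}_i$ of $\mathfrak{A}$ must be all homogeneous.

If $\mathfrak{p}_i$ is an isolated prime ideal of $\mathfrak{A}$, then $\mathfrak{q}_i$ is uniquely determined. Since
$\tau_\lambda(\mathfrak{p}_i)=\mathfrak{p}_i$, it follows that also $\tau_\lambda(\mathfrak{q}_i)=\mathfrak{q}_i$, whence $\mathfrak{q}_i$ is homogeneous.

Let now $\mathfrak{p}_i$ be an embedded prime of $\mathfrak{A}$. We apply to $\mathfrak{q}_i$ all the automorphisms
$\tau_\lambda$ ($\lambda\in K$, $\lambda\neq 0$) and we denote by $\mathfrak{q}_i^*$ the intersection of all the ideals
$\tau_\lambda(\mathfrak{q}_i)$. From the very definition of a primary ideal, it follows immediately
that $\mathfrak{q}_i^*$ is a primary ideal and that $\mathfrak{p}_i$ is its associated prime ideal. Moreover,
by Theorem 1, $\mathfrak{q}_i^*$ is homogeneous, and since $\mathfrak{A}\subseteq\mathfrak{q}_i^*\subseteq\mathfrak{q}_i$, we find:
$\mathfrak{A}=[\mathfrak{q}_1, \cdots, \mathfrak{q}_i^*, \cdots, \mathfrak{q}_h]$, as was asserted.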

Let now $K$ be a finite field. We adjoin to $K$ a transcendental $u$ and we
put $K^*=K(u)$, $P^*=K^*[\eta_0, \eta_1, \cdots, \eta_n]$ (we assume that $u$ is also a transcendental
with respect to the ring $P$). By the preceding case, our theorem holds
for the ring $P^*$ over the new ground field $K^*$. We can draw from this the
conclusion that the theorem also holds for the ring $P$, provided we first establish
certain relations between the ideals in $P$ and the ideals in $P^*$. If $\mathfrak{B}$
is an ideal in $P$, it determines in $P^*$ the extended ideal $P^*\cdot\mathfrak{B}$. Vice versa, every
ideal $\mathfrak{C}^*$ in $P^*$ gives rise to a contracted ideal in $P$, namely the ideal $\mathfrak{C}^*\cap P$
(but $\mathfrak{C}^*$ is not necessarily the extended ideal of its contracted ideal). Note
that every element $\omega^*$ of $P^*$ can be written in the form:

$$(1)\qquad \omega^*=\frac{1}{g(u)}\left[\omega_0u^m+\omega_1u^{m-1}+\cdots+\omega_m\right],$$

where $g(u)\in K[u]$ and $\omega_i\in P$. The relations which we wish to establish concern
the operations of extension and contraction just described, and are as
follows:

a. An ideal $\mathfrak{B}^*$ in $P^*$ is the extension of an ideal in $P$ if and only if the
congruence $\omega^*\equiv 0\ (\mathfrak{B}^*)$ implies $\omega_i\equiv 0\ (\mathfrak{B}^*)$, for $i=0, 1, \cdots, m$, where $\omega^*$ is written
in the form (1).

b. $P^*\cdot\mathfrak{B}\cap P=\mathfrak{B}$.

c. If $\mathfrak{B}=\mathfrak{B}_1\cap\mathfrak{B}_2$, then $P^*\cdot\mathfrak{B}=P^*\cdot\mathfrak{B}_1\cap P^*\cdot\mathfrak{B}_2$. The proofs are trivial.

d. If $\mathfrak{p}$ is a prime ideal in $P$, then $P^*\cdot\mathfrak{p}$ is also a prime ideal. This assertion
is essentially equivalent to the well known theorem that if $R$ is an integral
domain and if $u$ is a transcendental with respect to $R$, then $R[u]$ is also an
integral domain. In the present case $R$ is the ring $P/\mathfrak{p}$.

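Let now $\mathfrak{A}$ be a homogeneous ideal in $P$ and let us consider its extended
ideal $P^*\mathfrak{A}$ in $P^*$. It is clear that also $P^*\mathfrak{A}$ is a homogeneous ideal in $P^*$. Since
$K^*$ is an infinite field, we can write

$$P^*\mathfrak{A}=[\mathfrak{q}_1^*, \mathfrak{q}_2^*, \cdots, \mathfrak{q}_h^*],$$

where each $\mathfrak{q}_i^*$ is a homogeneous primary ideal. If we put $\mathfrak{q}_i=\mathfrak{q}_i^*\cap P$, we find,
by property b:

$$\mathfrak{A}=[\mathfrak{q}_1, \mathfrak{q}_2, \cdots, \mathfrak{q}_h].$$

It is obvious that if $\mathfrak{B}^*$ is a homogeneous ideal in $P^*$, then $\mathfrak{B}^*\cap P$ is also a
homogeneous ideal. Hence the ideals $\mathfrak{q}_i$ are all homogeneous. Since they are
obviously primary ideals, our theorem follows in view of the unicity theorems
concerning the decomposition of ideals into maximal primary components.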

In addition to the relations a, b, c, d, we shall also have occasion to use 
the following property: 

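e. If $\mathfrak{q}$ is a primary ideal in $P$ and if $\mathfrak{p}$ is its associated prime ideal, then
$P^*\cdot\mathfrak{q}$ is also primary and $P^*\cdot\mathfrak{p}$ is its associated prime ideal.

To prove e, we denote by $\mathfrak{p}^*$ the prime ideal $P^*\cdot\mathfrak{p}$ and we consider any
prime ideal $\mathfrak{p}_1^*$ of $P^*\cdot\mathfrak{q}$. We have: $\mathfrak{p}_1^*\supseteq P^*\mathfrak{q}\supseteq\mathfrak{q}$, whence $\mathfrak{p}_1^*\cap P\supseteq\mathfrak{q}$. But since
the contraction of a prime ideal is also a prime ideal, it follows that $\mathfrak{p}_1^*\cap P\supseteq\mathfrak{p}$,
whence

$$(3)\qquad \mathfrak{p}_1^*\supseteq\mathfrak{p}^*.$$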




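Now let $\omega^*$ be an arbitrary element of $\mathfrak{p}_1^*$ and let us write $\omega^*$ in the form (1).
Since $\mathfrak{p}_1^*$ is a prime ideal of $P^*\cdot\mathfrak{q}$, there exists in $P^*$ an element $\xi^*$ such that
$\omega^*\cdot\xi^*\equiv 0\ (P^*\cdot\mathfrak{q})$, $\xi^*\not\equiv 0\ (P^*\cdot\mathfrak{q})$. Let us also write $\xi^*$ in a form similar to (1):

$$\xi^*=\frac{1}{h(u)}\left[\xi_0u^\mu+\xi_1u^{\mu-1}+\cdots+\xi_\mu\right].$$

Since $\xi^*\not\equiv 0\ (P^*\cdot\mathfrak{q})$, not all the elements $\xi_0, \xi_1, \cdots, \xi_\mu$ are in $\mathfrak{q}$. We may assume
that $\xi_0\not\equiv 0\ (\mathfrak{q})$, since otherwise we can drop the term $\xi_0u^\mu$ (it is permissible
to replace $\xi^*$ by any element of $P^*$ which is congruent to $\xi^*$ modulo $P^*\cdot\mathfrak{q}$).
Since $\omega^*\xi^*\equiv 0\ (P^*\mathfrak{q})$, we must have, by a and b: $\omega_0\xi_0\equiv 0\ (\mathfrak{q})$, and consequently
$\omega_0\equiv 0\ (\mathfrak{p})$, since $\xi_0\not\equiv 0\ (\mathfrak{q})$. From this we conclude, in view of (3), that the element

$$\frac{1}{g(u)}\left[\omega_1u^{m-1}+\cdots+\omega_m\right]$$

also belongs to $\mathfrak{p}_1^*$. Continuing in the same fashion with this new element of
$\mathfrak{p}_1^*$, we conclude that $\omega_0, \omega_1, \cdots, \omega_m$ are all in $\mathfrak{p}$. Hence $\omega^*\in P^*\cdot\mathfrak{p}$. Since $\omega^*$
is an arbitrary element of $\mathfrak{p}_1^*$, it follows that $\mathfrak{p}_1^*\subseteq\mathfrak{p}^*$, and this, in view of (3),
yields the relation: $\mathfrak{p}_1^*=\mathfrak{p}^*$. What we have shown is that $P^*\mathfrak{q}$ has only one
prime ideal, namely $\mathfrak{p}^*$, and that is exactly what is asserted in e.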

The relation b shows that there is a (1, 1) correspondence between the
ideals $\mathfrak{B}$ in $P$ and their extended ideals $P^*\mathfrak{B}$ in $P^*$, for $P^*\mathfrak{B}_1=P^*\mathfrak{B}_2$ implies
$\mathfrak{B}_1=\mathfrak{B}_2$. By the property c, this correspondence is an isomorphism with respect
to the operation of intersection. It is a straightforward matter to show that
this correspondence is also an isomorphism with respect to the other elementary
operations on ideals, namely the operation of forming the sum, the product
and the quotient of two ideals:

c1. $P^*\cdot(\mathfrak{A}, \mathfrak{B})=(P^*\mathfrak{A}, P^*\mathfrak{B})$.

c2. $P^*\cdot(\mathfrak{A}\mathfrak{B})=P^*\mathfrak{A}\cdot P^*\mathfrak{B}$.

c3. $P^*\cdot(\mathfrak{A}:\mathfrak{B})=P^*\mathfrak{A}:P^*\mathfrak{B}$.

We shall refer to the relations c, c1, c2 and c3 as the isomorphism relations.
The question whether any one of these relations holds, arises quite generally 
whenever we deal with ideals in two rings P, P* such that P is a subring of P*. 
For a general treatment of the relationship between the ideals in two such 
rings, see Grell [1]. 

The ideal $(\eta_0, \eta_1, \cdots, \eta_n)$ is referred to as the irrelevant prime ideal in $P$.
Any primary ideal whose associated prime ideal is the irrelevant prime is also
referred to as irrelevant. Any prime homogeneous ideal, other than the irrelevant
prime, is of dimension $s+1$, $s\ge 0$, and defines an irreducible $s$-dimensional
subvariety of $V$. Two homogeneous ideals which differ only by the
irrelevant component define one and the same subvariety of $V$.

2. Homogeneous and nonhomogeneous coordinates. A preference for one 
or the other type of coordinates is in part a matter of taste. However, it can 




be claimed that in the study of properties of algebraic varieties, the use of 
homogeneous coordinates is indicated whenever one deals with properties in 
the large. For instance, the concept of a normal variety (Zariski [7, p. 285]) 
is defined essentially in terms of homogeneous coordinates. On the contrary, 
in questions pertaining to local properties it is preferable to use nonhomo- 
geneous coordinates. Thus, if our attention is focused on a given subvariety 
$W$ of $V$ and if, say, $\eta_0\neq 0$ on $W$, that is, if $\eta_0$ does not belong to the homogeneous
prime ideal by which $W$ is defined (whence $W$ does not lie in the
hyperplane $y_0=0$), then we may find it convenient to pass to the nonhomogeneous
coordinates $\xi_i=\eta_i/\eta_0$, $i=1, 2, \cdots, n$. With respect to these coordinates
"$W$ is at finite distance," an expression that we shall use consistently.
More generally, if $l=c_0\eta_0+c_1\eta_1+\cdots+c_n\eta_n\neq 0$ on $W$, $c_i\in K$, and if, say
$c_\alpha\neq 0$, then the quotients $\eta_i/l$, $i\neq\alpha$, may be equally well used as nonhomogeneous
coordinates $\xi_i$ of the general point of $V$.

It should be understood that the field $\Sigma$ of rational functions on $V$ is
the field $K(\xi_1, \cdots, \xi_n)$ generated by the nonhomogeneous coordinates.
The field $K(\eta_0, \eta_1, \cdots, \eta_n)$ is a simple transcendental extension of $\Sigma$. The
field $\Sigma$ consists of all quotients $f(\eta)/g(\eta)$, where $f$ and $g$ are forms of like degree;
in other words, $\Sigma$ consists of all elements of the field $K(\eta_0, \eta_1, \cdots, \eta_n)$
which are homogeneous of degree zero (Zariski [7, p. 284]).

We have also two coordinate rings: the ring $K[\eta_0, \eta_1, \cdots, \eta_n]$ of the homogeneous
coordinates, which we have denoted by $P$, and (for a given choice of
the nonhomogeneous coordinates $\xi_i$) the ring $K[\xi_1, \cdots, \xi_n]$, which we
shall denote by $\mathfrak{o}$. In order to elicit the relationship between the ideals in $\mathfrak{o}$
and the homogeneous ideals in $P$, we assume for simplicity that $\xi_i=\eta_i/\eta_0$ and
we consider the ring $\mathfrak{o}^*=K(\eta_0)[\xi_1, \xi_2, \cdots, \xi_n]$. Since $\eta_0$ is a transcendental
with respect to $\Sigma$, the two rings $\mathfrak{o}$ and $\mathfrak{o}^*$ are in the same relationship to each
other as the two rings $P$, $P^*$ of the preceding section. Therefore the correspondence
between the ideals in $\mathfrak{o}$ and their extended ideals in $\mathfrak{o}^*$ satisfies all
the relations a-e (in which $P$ and $P^*$ are naturally to be replaced by $\mathfrak{o}$ and $\mathfrak{o}^*$).
We shall denote by $M^*$ the class of $\mathfrak{o}^*$-ideals which are extensions of $\mathfrak{o}$-ideals.

From the pair of rings $\mathfrak{o}$, $\mathfrak{o}^*$ we pass to the pair of rings $\mathfrak{o}^*$, $P$. We have:
$\mathfrak{o}^*=K(\eta_0)[\eta_1, \eta_2, \cdots, \eta_n]$. The polynomial ring $K[\eta_0]$ is a subring of $P$, and
thus $\mathfrak{o}^*$ is a quotient ring of $P$, since $K[\eta_0]$ is at any rate closed under multiplication(1).
There is therefore a (1, 1) correspondence between the ideals $\mathfrak{A}^*$
in $\mathfrak{o}^*$ and those ideals $\mathfrak{A}$ in $P$ which are relatively prime to all elements of
$K[\eta_0]$, that is, which are such that $\mathfrak{A}:\alpha=\mathfrak{A}$, for all $\alpha$ in $K[\eta_0]$. The correspondence
is again that of extension and contraction: $\mathfrak{A}^*=\mathfrak{o}^*\mathfrak{A}$, $\mathfrak{A}=\mathfrak{A}^*\cap P$.
Prime or primary ideals $\mathfrak{A}$ go, respectively, into prime or primary ideals $\mathfrak{A}^*$.

(1) For properties of quotient rings used in the text, see Grell [1] and Krull [2, p. 18]. We
recall that a quotient ring is defined as follows: if $R$ is an integral domain and if $S$ is a subset of
$R$ which is closed under multiplication, then the quotients $\alpha/\beta$, $\alpha\in R$, $\beta\in S$, form a ring. This
is the quotient ring $R_S$. A special important case is the one in which $S$ is the set-theoretic complement
of a prime ideal $\mathfrak{p}$ in $R$, that is, $S=R-\mathfrak{p}$. In this case one writes $R_\mathfrak{p}$ instead of $R_S$.




The isomorphism relations c, c1, c2, c3 of the preceding section, with $P^*$ replaced
by $\mathfrak{o}^*$, continue to hold(2). These are properties which hold quite generally
for quotient rings.

It is immediately seen that an $\mathfrak{o}^*$-ideal $\mathfrak{A}^*$ is in the class $M^*$ if and only if
the corresponding ideal $\mathfrak{A}$ in $P$ is homogeneous. It is clear that a homogeneous
ideal is always relatively prime to any polynomial $f(\eta_0)$ if that polynomial
contains a constant term which is different from zero. Hence, for a homogeneous
ideal $\mathfrak{A}$ to be relatively prime to each element of $K[\eta_0]$, it is necessary and
sufficient that it be relatively prime to $\eta_0$. An equivalent condition is that
no prime ideal of $\mathfrak{A}$ be a divisor of $(\eta_0)$. We have therefore a (1, 1) correspondence
between the homogeneous ideals in $P$ which satisfy this last mentioned condition
and the ideals in $\mathfrak{o}$. This conclusion corresponds to the obvious geometrical
fact that by passing to nonhomogeneous coordinates $\eta_i/\eta_0$ we lose track of
all the subvarieties of $V$ which "are at infinity," that is, which lie in the
hyperplane $y_0=0$.

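Concretely, the relationship between two corresponding ideals $\mathfrak{a}$ and $\mathfrak{A}$
in $\mathfrak{o}$ and $P$, respectively, is as follows: a form $f(\eta_0, \eta_1, \cdots, \eta_n)$ belongs to $\mathfrak{A}$
if and only if $f(1, \xi_1, \cdots, \xi_n)$ belongs to $\mathfrak{a}$; a polynomial $\phi(\xi_1, \xi_2, \cdots, \xi_n)$
belongs to $\mathfrak{a}$ if and only if there exists in $\mathfrak{A}$ a form $f(\eta_0, \eta_1, \cdots, \eta_n)$ such
that $f(1, \xi_1, \cdots, \xi_n)=\phi(\xi_1, \cdots, \xi_n)$.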
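The passage between forms and polynomials described here can be sketched on free polynomial rings. The code below is only an illustration under stated assumptions: it works with indeterminates rather than with the coordinate rings of a particular variety, stores polynomials as dictionaries from exponent tuples to coefficients, and uses the hypothetical names `dehomogenize` and `homogenize`.

```python
def dehomogenize(form):
    """Send f(eta_0, ..., eta_n) to f(1, xi_1, ..., xi_n).

    Each monomial loses its eta_0 exponent; like terms are collected.
    """
    out = {}
    for mono, c in form.items():
        key = mono[1:]
        out[key] = out.get(key, 0) + c
    return {k: c for k, c in out.items() if c != 0}


def homogenize(poly):
    """Send a nonzero phi(xi_1, ..., xi_n) to a form f with f(1, xi) = phi.

    Every monomial is padded with the power of eta_0 that brings it up to
    the total degree of phi.
    """
    d = max(sum(mono) for mono in poly)
    return {(d - sum(mono),) + mono: c for mono, c in poly.items()}


# phi = xi_1^2 - xi_2 + 1
phi = {(2, 0): 1, (0, 1): -1, (0, 0): 1}
f = homogenize(phi)
print(all(sum(mono) == 2 for mono in f))  # every term of f has degree 2
print(dehomogenize(f) == phi)
```

Multiplying the form by a power of $\eta_0$ does not change its image under `dehomogenize`, which is one way to see why only homogeneous ideals relatively prime to $\eta_0$ enter the (1, 1) correspondence.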

3. The center of a valuation. Let $W$ be an irreducible subvariety of $V$. If
$\xi_1, \cdots, \xi_n$ are nonhomogeneous coordinates with respect to which $W$ is
at finite distance, then $W$ is given by a prime ideal $\mathfrak{p}$ in the ring $\mathfrak{o}$. By the
quotient ring $Q(W)$ of $W$ we mean the quotient ring $\mathfrak{o}_\mathfrak{p}$. If $\mathfrak{P}$ is the homogeneous
prime ideal which corresponds to $W$ in the ring $P$, then it is easily seen
that $Q(W)$ consists of all quotients $f(\eta)/g(\eta)$, where $f$ and $g$ are forms of like
degree in $\eta_0, \eta_1, \cdots, \eta_n$ and where $g(\eta)\not\equiv 0\ (\mathfrak{P})$. This shows, incidentally,
that $Q(W)$ is independent of the choice of the nonhomogeneous coordinates.

The relationship between the ideals in $\mathfrak{o}$ and the ideals in $Q(W)$ is the one
described in the preceding section for general quotient rings. The (1, 1) correspondence
is now between the ideals in $Q(W)$ and those ideals in $\mathfrak{o}$ which are
relatively prime to each element(3) of $\mathfrak{o}-\mathfrak{p}$. An ideal in $\mathfrak{o}$ satisfies this condition
if and only if each prime ideal of this ideal is a multiple of $\mathfrak{p}$. This shows
that by passing from the ring $\mathfrak{o}$ (or from the ring $P$) to the quotient ring $Q(W)$
we lose all irreducible subvarieties of $V$ which do not contain $W$. For this
reason we may regard the ideal theory of $Q(W)$ as that pertaining to the
"neighborhood" of $W$. An important property of the quotient ring $\mathfrak{o}_\mathfrak{p}$ is the
following: the non-units in $\mathfrak{o}_\mathfrak{p}$ form an ideal(4). This is the prime ideal in $\mathfrak{o}_\mathfrak{p}$
which corresponds to (that is, is the extension of) the prime ideal $\mathfrak{p}$ in $\mathfrak{o}$.

(2) However, it is to be pointed out that the set of ideals $\mathfrak{A}$ which correspond to ideals $\mathfrak{A}^*$
in the above correspondence is not in general closed under the operations of multiplication and
addition of ideals. It is closed under the operations of intersection and quotient formation.

(3) Quite generally, given a quotient ring $R_S$ (see footnote 1), there is a (1, 1) correspondence
between the ideals in $R_S$ and the ideals in $R$ which are relatively prime to each element of $S$.

Let $v$ be a non-trivial valuation of the field $\Sigma$ over $K$ ($v(c)=0$ if $c\in K$ and
$c\neq 0$; $v(\omega)\neq 0$ for some $\omega$ in $\Sigma$). We consider linear forms $l$ in $\eta_0, \eta_1, \cdots, \eta_n$,
with coefficients in $K$, such that $v(\eta_i/l)\ge 0$, $i=0, 1, \cdots, n$. If $l_0$ is one such
form, we consider the homogeneous ideal $\mathfrak{P}$ generated by the forms $f(\eta)$ having
the following property: if $f(\eta)$ is of degree $m$, then $v(f/l_0^m)>0$. The ideal $\mathfrak{P}$
is independent of $l_0$, since if $l_1$ is another linear form $l$, then $v(l_1/l_0)=0$.
Moreover, $\mathfrak{P}$ is obviously a prime ideal, different from the irrelevant prime.
It is also different from the zero ideal, since $v$ is a non-trivial valuation. Consequently
$\mathfrak{P}$ defines an irreducible proper subvariety $W$ of $V$, of dimension
at least 0. This subvariety $W$ we call the center of the valuation $v$ (on $V$).
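A small numerical illustration may help fix the definition. The sketch below assumes that no coordinate $\eta_i$ is identically zero on $V$ and that the valuation is supplied only through the integer values $w_i=v(\eta_i/\eta_0)$ (so $w_0=0$); the function names are hypothetical. A coordinate $\eta_j$ can serve as the linear form $l$ exactly when $w_j$ is minimal, and a coordinate $\eta_i$ vanishes on the center exactly when $w_i$ exceeds that minimum.

```python
def admissible_coordinates(values):
    """Indices j with v(eta_i / eta_j) >= 0 for every i.

    values[i] is v(eta_i / eta_0), so v(eta_i / eta_j) = values[i] - values[j].
    """
    low = min(values)
    return [j for j, w in enumerate(values) if w == low]


def coordinates_vanishing_on_center(values):
    """Indices i with v(eta_i / l) > 0 for an admissible coordinate l."""
    low = min(values)
    return [i for i, w in enumerate(values) if w > low]


# Projective line with t = eta_1 / eta_0.
# Order of vanishing at t = 0: v(t) = 1.
print(admissible_coordinates([0, 1]), coordinates_vanishing_on_center([0, 1]))
# Order at t = infinity: v(t) = -1.
print(admissible_coordinates([0, -1]), coordinates_vanishing_on_center([0, -1]))
```

In the first case $\eta_1$ vanishes on the center, which is the point $t=0$; in the second $\eta_0$ vanishes on it, so the center lies at infinity with respect to the coordinate $t$.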

It is clear that any linear form $l$ for which $v(\eta_i/l)\ge 0$, $i=0, 1, \cdots, n$, is
such that $l\neq 0$ on $W$. Conversely, if $l$ is a linear form in the $\eta$'s and if $l\neq 0$
on $W$, then we must have: $v(l/l_0)=0$, whence $v(\eta_i/l)=v(\eta_i/l_0)\ge 0$. Thus the
linear forms $l$ which played an auxiliary role in our definition of the center $W$
of $v$, turn out simply to be those forms which do not vanish on that center.

In terms of nonhomogeneous coordinates the center $W$ is obtained as follows.
Let us consider the nonhomogeneous coordinates $\xi_i=\eta_i/\eta_0$. Should the
center $W$ be at finite distance with respect to these coordinates, we must have
$\eta_0\neq 0$ on $W$. But then, by the remark just made, $v(\eta_i/\eta_0)\ge 0$, $i=0, 1, \cdots, n$,
and the entire coordinate ring $\mathfrak{o}$ must be contained in the valuation ring $R_v$ of $v$.
Conversely, if $\mathfrak{o}\subseteq R_v$, then reversing the above reasoning we see immediately
that the center $W$ of $v$ is at finite distance with respect to the nonhomogeneous
coordinates $\xi_i$. If $f(\eta_0, \eta_1, \cdots, \eta_n)$ is a form of degree $\nu$ and if $f=0$ on $W$,
then $v(f/\eta_0^\nu)>0$, whence $v(f(1, \xi_1, \cdots, \xi_n))>0$. Conversely, it is seen immediately
that every polynomial in $\xi_1, \xi_2, \cdots, \xi_n$ which has positive value in $v$
arises from a form in the $\eta$'s which vanishes on $W$. Hence the elements of $\mathfrak{o}$
which have positive value in the given valuation $v$ form a prime ideal $\mathfrak{p}$ in $\mathfrak{o}$,
and the center of $v$ is the irreducible subvariety of $V$ defined by the ideal $\mathfrak{p}$. This
conclusion holds for any choice of the nonhomogeneous coordinates, provided
the corresponding coordinate ring is contained in the valuation ring $R_v$.

The following characterizations of the center of a valuation are useful in
applications:


THEOREM 3. An irreducible subvariety $W$ of $V$ is the center of a valuation $v$
if and only if either one of the following conditions is satisfied: (1) $Q(W)\subseteq R_v$
and the non-units of $Q(W)$ are non-units of $R_v$; (2) $Q(W)\subseteq R_v$ and $W$ is the
maximal subvariety of $V$ whose quotient ring is contained in $R_v$.


Proof. Suppose that $W$ is the center of $v$. If $f(\eta)/g(\eta)\in Q(W)$, where $f$ and $g$
are forms of degree $m$, then $g(\eta)\neq 0$ on $W$. Therefore $v(g/l_0^m)=0$, while
$v(f/l_0^m)\ge 0$, and consequently $f(\eta)/g(\eta)\in R_v$. Moreover, if $f/g$ is a non-unit
in $Q(W)$, then $f=0$ on $W$. Hence $v(f/l_0^m)>0$, and consequently $v(f/g)>0$, that
is, $f/g$ is also a non-unit in $R_v$. This proves that condition (1) is necessary.
To show the necessity and sufficiency of condition (2), let $W_1$ be another irreducible
subvariety of $V$ with the property: $Q(W_1)\subseteq R_v$. Let $l_0$ be a linear form
in $\eta_0, \eta_1, \cdots, \eta_n$ which does not vanish on $W_1$. Then $\eta_i/l_0\in Q(W_1)\subseteq R_v$,
whence $v(\eta_i/l_0)\ge 0$, $i=0, 1, \cdots, n$. This shows that $l_0\neq 0$ on $W$. Let now
$f(\eta)$ be any form in the $\eta$'s which vanishes on $W$. If $f$ is of degree $m$, then
$v(f/l_0^m)>0$, and consequently $l_0^m/f\notin R_v$. Since, by hypothesis, $Q(W_1)\subseteq R_v$, we
have a fortiori, $l_0^m/f\notin Q(W_1)$, whence $f=0$ on $W_1$. Thus we find that "$f=0$
on $W$" implies "$f=0$ on $W_1$." Therefore $W_1\subseteq W$, and this proves our assertion.

(4) Chain theorem rings with the property that their non-units form an ideal have been
called by Krull "Stellenringe" (see Krull [3]). We propose the translation: "local rings." The
quotient ring of any irreducible subvariety of $V$ is a local ring.

Now it follows immediately that condition (1) is also sufficient. For, if $W_1$
is any proper subvariety of the center $W$ of $v$, then(5) $Q(W_1)\subset Q(W)$ and there
exist non-units in $Q(W_1)$ which are units in $Q(W)$ and which are therefore
also units in $R_v$. This completes the proof of the theorem.

We shall conclude this section with two lemmas which we shall have oc- 
casion to use in the sequel. 


LEMMA 1. If $W$ and $W_1$ are irreducible subvarieties of $V$, then $W_1\subseteq W$ if
and only if $Q(W_1)\subseteq Q(W)$, and(6) $W_1\subset W$ if and only if $Q(W_1)\subset Q(W)$.


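The proof is straightforward. If $W_1\subseteq W$ and if $W_1$ is at finite distance with
respect to the nonhomogeneous coordinates $\xi_i$, then also $W$ is at finite distance,
and the corresponding prime ideals $\mathfrak{p}_1$ and $\mathfrak{p}$ are such that $\mathfrak{p}_1\supseteq\mathfrak{p}$. Hence
$\mathfrak{o}_{\mathfrak{p}_1}\subseteq\mathfrak{o}_\mathfrak{p}$. If $\mathfrak{p}_1\supset\mathfrak{p}$, and if $\alpha$ is an element of $\mathfrak{p}_1$, not in $\mathfrak{p}$, then $1/\alpha\in\mathfrak{o}_\mathfrak{p}$, but
$1/\alpha\notin\mathfrak{o}_{\mathfrak{p}_1}$, whence $\mathfrak{o}_{\mathfrak{p}_1}\subset\mathfrak{o}_\mathfrak{p}$. Conversely, assume that $Q(W_1)\subseteq Q(W)$. If $\mathfrak{P}$ and
$\mathfrak{P}_1$ are the homogeneous ideals corresponding respectively to $W$ and $W_1$, let
$g(\eta)$ be an arbitrary form such that $g\neq 0$ on $W_1$. Let $f(\eta)$ be a form of the
same degree as $g$, such that $f\neq 0$ on $W$. Then $f/g\in Q(W_1)$, and hence
$f/g\in Q(W)$. This implies $g\neq 0$ on $W$, in view of the assumption that $f\neq 0$
on $W$. Hence if $g\neq 0$ on $W_1$, we must also have: $g\neq 0$ on $W$. This shows that
$\mathfrak{P}_1\supseteq\mathfrak{P}$, whence $W_1\subseteq W$, as asserted.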


LEMMA 2. If $v$ and $v_1$ are two valuations of $\Sigma/K$, with centers $W$ and $W_1$,
respectively, and if $v$ is composite with $v_1$, then $W\subseteq W_1$.


Proof. A valuation v is composite with another valuation 1, if v is ob- 
tained by combining 2 with a valuation v’ of the residue field 2; of 1. The 
manner in which v and v’ are to be combined is best described in terms of 
the homomorphic mapping of 2 upon the residue field of the valuation (to- 
gether with the symbol ©), a mapping which is determined by the valuation 


(®) See the lemma which follows immediately. 
(*) We use the symbol C_ only for proper subsets. 




and which in its turn determines the valuation uniquely. Let $\tau_1$ be the homomorphic
mapping of $\Sigma$ upon $(\Sigma_1, \infty)$ determined by $v_1$, and let $\tau'$ be the
homomorphic mapping of $\Sigma_1$ upon the residue field $\Sigma'$ (and the symbol $\infty$)
of $v'$. Then $v$ is the valuation of $\Sigma$ determined by the homomorphic mapping
$\tau = \tau_1\tau'$ of $\Sigma$ onto $(\Sigma', \infty)$. For further details, see Krull [2, p. 112].

Now let $f(\eta)$ be a form which vanishes on $W_1$ and let $g(\eta)$ be a form of the
same degree as $f(\eta)$ such that $g \neq 0$ on $W$ and on $W_1$. The quotient $\zeta = f/g$ is a
non-unit of $Q(W_1)$. Hence, by Theorem 3, $\zeta$ is also a non-unit in $R_{v_1}$. Consequently
$\tau_1(\zeta) = 0$, and therefore also $\tau(\zeta) = 0$. Hence $\zeta$ is a non-unit in $R_v$, and,
in view of our assumption that $g$ is not zero on $W$, this is only possible if $f=0$
on $W$, q.e.d.
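
A simple instance of a composite valuation may help to fix ideas. The sketch below is an illustration supplied here, not part of the original argument: it assumes $\Sigma = K(x, y)$ and $V$ the plane with nonhomogeneous coordinates $x, y$.

```latex
% Assumed setting: \Sigma = K(x,y); v_1 = order of vanishing along y = 0.
\[
  v_1(f) = a \quad\text{if } f = y^{a} g,\ \ g(x,0) \neq 0, \infty ; \qquad
  \Sigma_1 = K(x), \qquad \tau_1 : f \mapsto f(x, 0).
\]
% v' = order of vanishing at x = 0 on the residue field K(x).
% The composite valuation v takes values in Z x Z, ordered lexicographically:
\[
  v(f) = \bigl(a,\ b\bigr), \qquad b = v'\bigl(g(x,0)\bigr).
\]
% Centers on V: v_1(y) > 0 only, so W_1 is the line y = 0;
% v(x) = (0,1) > 0 and v(y) = (1,0) > 0, so W is the origin.
\[
  W = \{x = y = 0\} \subseteq W_1 = \{y = 0\}.
\]
```

Here $\tau_1(y) = 0$ forces $\tau(y) = 0$, which is the step used in the proof of Lemma 2.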

4. Existence theorems for valuations with a preassigned center. If $R_v$ is
the valuation ring of a given valuation $v$ of $\Sigma/K$ and if $\mathfrak{P}_v$ denotes the prime
ideal of non-units of $R_v$, then by the dimension of $v$ is meant the degree of
transcendency of the residue field $R_v/\mathfrak{P}_v$ (over $K$). Let $W$ be the center of $v$
on $V$ and let $\mathfrak{J}$ denote the quotient ring $Q(W)$. By Theorem 3 we have:
$\mathfrak{P}_v \cap \mathfrak{J} = \mathfrak{m}$, where $\mathfrak{m}$ is the ideal of non-units in $\mathfrak{J}$. Hence $\mathfrak{J}/\mathfrak{m}$ is a subring
of $R_v/\mathfrak{P}_v$, and therefore the dimension of $W$ is not greater than the dimension
of $v$. If $v$ is of dimension $r-1$, it is called a divisor. A divisor is of the first or of
the second kind with respect to $V$, according as its center on $V$ is of dimension
$r-1$ or less than $r-1$.
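
The distinction between the two kinds of divisors can be seen on the plane. The following sketch is an added illustration under the assumption $r=2$, $\Sigma = K(x,y)$, with $V$ the plane of the coordinates $x, y$. Neither valuation is taken from the text.

```latex
% Divisor of the first kind: order of vanishing along the line y = 0.
\[
  v_1 : \quad R_{v_1}/\mathfrak{P}_{v_1} = K(x), \quad \dim v_1 = 1 = r-1,
  \quad \text{center } \{y=0\},\ \text{of dimension } 1 .
\]
% Divisor of the second kind: order of vanishing at the origin, i.e.
% for a polynomial f, v_O(f) = degree of its lowest-degree homogeneous part.
\[
  v_O : \quad R_{v_O}/\mathfrak{P}_{v_O} = K(y/x), \quad \dim v_O = 1 = r-1,
  \quad \text{center } \{x=y=0\},\ \text{of dimension } 0 .
\]
% In both cases dim(center) <= dim(valuation), as required by the text.
```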


THEOREM 4. Given an $s$-dimensional irreducible subvariety $W$ of $V$, there
exist valuations of center $W$, of any dimension $\rho$, $s \le \rho \le r-1$.


Proof. We consider first two special cases: (a) $s=r-1$; (b) $s<r-1$,
$\rho=r-1$.

Case (a) ($s=r-1$). Let $\mathfrak{J}^*$ be the integral closure of $\mathfrak{J}$ in $\Sigma$. The $(r-1)$-dimensional
prime ideal $\mathfrak{m}$ of $\mathfrak{J}$ may split in $\mathfrak{J}^*$ into several prime ideals
$\mathfrak{m}_1^*, \mathfrak{m}_2^*, \dots, \mathfrak{m}_h^*$, all of the same dimension $r-1$. It is well known that the
quotient rings $\mathfrak{J}^*_{\mathfrak{m}_i^*}$ are valuation rings of divisors $v_1, v_2, \dots, v_h$. The center
of each divisor $v_i$ is our preassigned $W$, and in this fashion all the divisors of
center $W$ are obtained.

Case (b) ($s<r-1$, $\rho=r-1$). Assuming that $W$ is at finite distance with
respect to the nonhomogeneous coordinates $\xi_i$, let $\mathfrak{o}$ denote, as usual, the ring
of these coordinates, and let $\mathfrak{p}$ be the prime ideal of $W$ in $\mathfrak{o}$. Let $\omega_1, \omega_2, \dots, \omega_m$
be a base of the ideal $\mathfrak{p}$. We select one element among these $m$ elements $\omega_i$ and
we denote it by $\omega$. We pass to the ring $\mathfrak{o}' = \mathfrak{o}[\omega_1/\omega, \omega_2/\omega, \dots, \omega_m/\omega]$, and we
first prove the following lemma:


LEMMA 3. For at least one mode of selecting the element $\omega$ among the elements
$\omega_1, \omega_2, \dots, \omega_m$, the following relation will be satisfied: $\mathfrak{o}'\cdot\mathfrak{p} \cap \mathfrak{o} = \mathfrak{p}$.


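Proof of the lemma. Since $\omega_i = (\omega_i/\omega)\cdot\omega \in \mathfrak{o}'\cdot\omega$, it follows that the ideal
$\mathfrak{o}'\cdot\mathfrak{p}$ coincides with the principal ideal $\mathfrak{o}'\cdot\omega$. Now let us suppose that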




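$\mathfrak{o}'\cdot\omega \cap \mathfrak{o} \neq \mathfrak{p}$, and let us see what restriction, if any, this assumption
imposes on the element $\omega$. Let $\zeta$ be an element of $\mathfrak{o}$, which belongs to $\mathfrak{o}'\cdot\omega$
but not to $\mathfrak{p}$. We will have then: $\zeta = \omega\cdot f(\omega_1/\omega, \omega_2/\omega, \dots, \omega_m/\omega)$, where
$f(z) = f(z_1, z_2, \dots, z_m) \in \mathfrak{o}[z_1, z_2, \dots, z_m]$. If $\nu$ is the degree of $f$, the above
expression for $\zeta$ leads to a relation of the form: $\zeta\cdot\omega^{\nu-1} = \phi(\omega_1, \omega_2, \dots, \omega_m)$, where
$\phi$ is a form of degree $\nu$, with coefficients in $\mathfrak{o}$. This relation implies that $\zeta\cdot\omega^{\nu-1}$
is in $\mathfrak{p}^{\nu}$. Since $\zeta \not\equiv 0\ (\mathfrak{p})$, we conclude that $\mathfrak{p}^{\nu} : \omega^{\nu-1}$ is a proper divisor of $\mathfrak{p}$.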

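If our lemma is false, then for each element $\omega_i$, $i=1, 2, \dots, m$, there
must exist an integer $\nu_i$ such that $\mathfrak{p}^{\nu_i} : \omega_i^{\nu_i-1} \supset \mathfrak{p}$. If $\sigma = \max(\nu_1, \nu_2, \dots, \nu_m)$,
then we will have $\mathfrak{p}^{\sigma} : \omega_i^{\sigma-1} \supset \mathfrak{p}$, $i=1, 2, \dots, m$. Hence we have also
$\mathfrak{p}^{\sigma+\rho} : \mathfrak{p}^{\rho}\omega_i^{\sigma-1} \supset \mathfrak{p}$, for any integer $\rho$. Now if $q = (\sigma-2)\cdot m + 1$, then it is clear
that $(\mathfrak{p}^{\rho}\omega_1^{\sigma-1}, \dots, \mathfrak{p}^{\rho}\omega_m^{\sigma-1}) = \mathfrak{p}^{q}$, if $\rho = q-\sigma+1$. Therefore the quotient
$\mathfrak{p}^{q+1} : \mathfrak{p}^{q}$ is the intersection of the ideals $\mathfrak{p}^{q+1} : \mathfrak{p}^{\rho}\omega_i^{\sigma-1}$, $i=1, 2, \dots, m$. But
each of these $m$ ideals is, by hypothesis, a proper divisor of $\mathfrak{p}$. Hence also
$\mathfrak{p}^{q+1} : \mathfrak{p}^{q}$ is a proper divisor, and this is impossible since(7) $\mathfrak{p}^{q+1} : \mathfrak{p}^{q} = \mathfrak{p}$. Our
lemma is thus proved.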
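
The counting step behind the choice $q = (\sigma-2)\cdot m+1$ in the proof of Lemma 3 can be made explicit. The sketch below is our reading of that step, and the numerical case $m=2$, $\sigma=3$ is chosen only for illustration.

```latex
% Pigeonhole: a monomial of degree q in \omega_1,...,\omega_m has exponents
% e_1 + ... + e_m = q. If every e_i <= \sigma - 2, then
\[
  e_1 + \dots + e_m \le (\sigma-2)\, m < (\sigma-2)\, m + 1 = q ,
\]
% a contradiction; hence some e_i >= \sigma - 1, and the monomial lies in
% \omega_i^{\sigma-1} \mathfrak{p}^{\rho} with \rho = q - \sigma + 1.
%
% Illustrative case m = 2, \sigma = 3: q = 3, \rho = 1, and
\[
  \mathfrak{p}^{3} = (\omega_1^{3},\ \omega_1^{2}\omega_2,\ \omega_1\omega_2^{2},\ \omega_2^{3})
  = \omega_1^{2}\,\mathfrak{p} + \omega_2^{2}\,\mathfrak{p} .
\]
```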

We therefore may assume that $\mathfrak{o}'\cdot\mathfrak{p} \cap \mathfrak{o} = \mathfrak{p}$. This implies at any rate that
the ideal $\mathfrak{o}'\cdot\omega$ is not the unit ideal, whence its minimal prime ideals are all
$(r-1)$-dimensional. The relation $\mathfrak{o}'\cdot\mathfrak{p} \cap \mathfrak{o} = \mathfrak{p}$ also implies, and is in fact equivalent
to, the assertion that at least one minimal prime ideal of $\mathfrak{o}'\cdot\omega$ must contract
to $\mathfrak{p}$. Let $\mathfrak{p}'$ be such a minimal prime.

Now let $V'$ be the projective model whose general point (in nonhomogeneous
coordinates) is(8) $(\xi_1, \dots, \xi_n, \omega_1/\omega, \omega_2/\omega, \dots, \omega_m/\omega)$, so that $\mathfrak{o}'$ is
the ring of the nonhomogeneous coordinates of the general point of $V'$. Let
$W'$ be the $(r-1)$-dimensional subvariety of $V'$ which is defined by the prime
ideal $\mathfrak{p}'$. By the preceding case (a) there exists an $(r-1)$-dimensional valuation $v$
whose center on $V'$ is $W'$. Since $\mathfrak{p}' \cap \mathfrak{o} = \mathfrak{p}$, it follows that the center of $v$ on $V$
is $W$. This establishes our theorem in the case under consideration.

To prove the theorem in the general case, $s<r-1$, $s \le \rho < r-1$, we shall
keep $s$ and $\rho$ fixed and we shall proceed by induction with respect to $r$, since,
by the case (b), the theorem is true if $r=\rho+1$. We consider an $(r-1)$-dimensional
irreducible subvariety $W_1$ of $V$ which contains $W$ and we denote by
$v_1$ a divisor of center $W_1$. The residue field $\Sigma_1^*$ of $v_1$ is a finite algebraic extension
of the field $\Sigma_1$ of rational functions on $W_1$. By our induction there exists a
valuation $v'$ of $\Sigma_1$, of dimension $\rho$, whose center on $W_1$ is $W$. This valuation $v'$
has at least one extension $v^*$ in $\Sigma_1^*$. Compounding $v_1$ with $v^*$ we get a composite
valuation $v$ of $\Sigma$, of dimension $\rho$, whose center is $W$. This completes the
proof of our theorem.


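(7) Let $\mathfrak{p}^{q+1} : \mathfrak{p}^{q} = \mathfrak{A}$. We have then: $\mathfrak{A}\mathfrak{p}^{q} \subseteq \mathfrak{p}^{q+1}$. On the other hand $\mathfrak{A} \supseteq \mathfrak{p}$, whence
$\mathfrak{A}\mathfrak{p}^{q} \supseteq \mathfrak{p}^{q+1}$. Consequently $\mathfrak{A}\mathfrak{p}^{q} = \mathfrak{p}\cdot\mathfrak{p}^{q}$. From this it follows (see Krull [2, p. 36]) that
$\mathfrak{A}$ and $\mathfrak{p}$ have the same radical. Since $\mathfrak{A} \supseteq \mathfrak{p}$, it follows that $\mathfrak{A}=\mathfrak{p}$.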

(8) Comparison with section 11 will show that our V’ is the transform of V by a monoidal 
transformation of center W. 




The preceding proof does not give an adequate idea of the totality of all 
valuations having a preassigned center. The valuations obtained in the course 
of the proof are special in the sense that if their dimension is p then the rank 
of their value group is r—p. The following theorem gives more information 
about the arbitrary elements which can be assigned in the construction of 
valuations with a given center(9):


THEOREM 5. Given an arbitrary descending chain(10) $W_0 \supseteq W_1 \supseteq \dots \supseteq W_{\sigma-1}$
of irreducible subvarieties of $V$ and given any set of integers $\rho_0, \rho_1, \dots, \rho_{\sigma-1}$
such that $r-1 \ge \rho_0 > \rho_1 > \dots > \rho_{\sigma-1}$, $\rho_i \ge$ dimension of $W_i$, there exists a
sequence of valuations $v_0, v_1, \dots, v_{\sigma-1}$ such that: (1) $v_i$ is of dimension $\rho_i$, of rank
$i+1$, and its center is $W_i$; (2) $v_i$ is compounded with $v_{i-1}$.


We first prove the theorem in the following two special cases: (a) $\sigma=1$,
$\rho_0=s=$ dimension of $W_0$; (b) $\sigma=1$, $\rho_0>s$.

(a) $\rho_0=s=$ dimension of $W_0$. In this case we have to prove the existence
of a rank 1 valuation, of dimension $s$, whose center is a given $s$-dimensional
irreducible subvariety $W_0$ of $V$. We shall use nonhomogeneous
coordinates $\xi_1, \xi_2, \dots, \xi_n$ with respect to which $W_0$ is at finite distance,
so that $W_0$ is given by a prime ideal $\mathfrak{p}$ in the ring $\mathfrak{o}$ of these coordinates.
We then adjoin to the ground field $s$ elements of $\mathfrak{o}$ which are algebraically
independent on $W_0$, that is, algebraically independent mod $\mathfrak{p}$. In this fashion
we achieve a reduction to the case $s=0$, so that we may assume that $W_0$ is a
point, say $A$. It is also permissible to assume(11) that $\xi_{r+1}, \xi_{r+2}, \dots, \xi_n$ are


(9) Which ordered groups can be preassigned as value groups for valuations of fields of
algebraic functions, is a question which has been solved completely by S. MacLane and O. F. G.
Schilling in their paper [4].

(10) Note that we do not assume that the chain is strictly descending, that is, that each $W_i$
is a proper subvariety of $W_{i-1}$.

(11) In the case of infinite ground fields, or of ground fields with “sufficiently many” elements,
this assumption can always be realized, in view of the usual proof of Emmy Noether’s
normalization theorem, by first subjecting the nonhomogeneous coordinates to a
linear transformation with “non-special” coefficients in $K$. In the case of finite ground fields
this is no longer true. However, it will be proved in II.2 that there always exists in the ring
$K[\xi_1, \xi_2, \dots, \xi_n]$ a set of $r$ algebraically independent elements $\zeta_1, \zeta_2, \dots, \zeta_r$ such that the
ring is integrally dependent on the ring $K[\zeta_1, \zeta_2, \dots, \zeta_r]$. We may then simply include the
elements $\zeta_i$ among the elements $\xi_i$, which does not change the ring of nonhomogeneous
coordinates, and we may then proceed as in the text.

We could also proceed in the following fashion. We first pass to the field $K'$ which is generated
over $K$ by the coefficients of the linear transformation on the $\xi_i$ mentioned above, and we
consider an extension field $\Sigma' = K'\Sigma$ of $\Sigma$. The field $K'$ may be assumed to be an algebraic
extension of $K$. We obtain a new variety $V'$ over $K'$, with the same general point $(\xi_1, \xi_2, \dots, \xi_n)$
as $V$. The original subvariety $W_0$ splits on $V'$ into at most a finite number of varieties, all of
the same dimension as $W_0$. If $W_0'$ is one of them, the proof given in the text leads to a valuation
$v'$ of $\Sigma'$ of dimension $s$ and rank 1, with center $W_0'$. The valuation $v$ of $\Sigma$ induced by $v'$ will
have center $W_0$, dimension $s$ and rank 1.




integrally dependent on $K[\xi_1, \xi_2, \dots, \xi_r]$. Let $A'$ be the projection(12) of the
point $A$ into the linear space of the $r$ independent variables $\xi_1, \xi_2, \dots, \xi_r$. Let
$v'$ be a zero-dimensional and rank 1 valuation of the field $K(\xi_1, \dots, \xi_r)$
whose center in the above linear space is the point(13) $A'$. We denote, as usual,
by $R_{v'}$ the valuation ring of $v'$, and by $A_1, A_2, \dots, A_h$ the other points of $V$,
at finite distance and different from $A$, which project into $A'$. We can find an
element $\omega$ in $\mathfrak{o}$ such that $\omega=0$ at $A$, $\omega \neq 0$ at $A_i$, $i=1, 2, \dots, h$. Let


$$\omega^{m} + a_1(\xi_1, \dots, \xi_r)\,\omega^{m-1} + \dots + a_m(\xi_1, \dots, \xi_r) = 0$$


be the irreducible equation of integral dependence for $\omega$ over the ring
$K[\xi_1, \xi_2, \dots, \xi_r]$. Since $\omega=0$ at $A$, we must have $a_m=0$ at $A'$. Hence $a_m$ is a
non-unit in $R_{v'}$. We assert that $R_{v'}[\omega]$ is a proper ring (that is, is not a field).
We prove this by showing that $\omega$ is a non-unit in this ring. For suppose that $\omega$
is a unit in $R_{v'}[\omega]$. Then we would have: $1=\omega\cdot g(\omega)$, where $g(\omega)$ is a polynomial
with coefficients in $R_{v'}$. Using the above equation of integral dependence
and observing that the coefficients $a_i$ are polynomials, hence are elements
of $R_{v'}$, we can reduce the degree of $g(\omega)$. We thus find a new relation of the
form: $1 = b_1\omega + b_2\omega^2 + \dots + b_m\omega^m$, where $b_i \in R_{v'}$. Comparing this relation
with the above relation of integral dependence, we conclude that
$b_m = -1/a_m$, a contradiction, since $a_m$ is a non-unit in $R_{v'}$.

Since $\omega$ is a non-unit in $R_{v'}[\omega]$, there exists at least one valuation $v$ of $\Sigma$
such that $R_v \supseteq R_{v'}[\omega]$ and such that $\omega$ is a non-unit(14) in $R_v$. Since $v'$ is of rank
1, $R_{v'}$ is a maximal subring of $K(\xi_1, \xi_2, \dots, \xi_r)$. Hence $R_v \cap K(\xi_1, \xi_2, \dots, \xi_r)
= R_{v'}$, and therefore the valuation $v$ is an extension of $v'$ and has the same rank
and the same dimension as $v'$, that is, rank 1 and dimension 0. The center of $v$
on $V$ must be a point at finite distance (since $\xi_1, \xi_2, \dots, \xi_r \in R_{v'} \subseteq R_v$ and
hence $\mathfrak{o} \subseteq R_v$, for $R_v$ is integrally closed), and this point must project into the
point $A'$. The center cannot be any of the points $A_1, A_2, \dots, A_h$, since $\omega \neq 0$
at $A_i$ and this implies that $\omega$ is a unit in the quotient ring $Q(A_i)$, while, as
we have just seen, $\omega$ is a non-unit in $R_v$ (compare with Theorem 3). Hence
the center of $v$ is the point $A$, q.e.d.

(b) Let now $\sigma=1$, $\rho_0>s$. We refer to the case (b) of the proof of Theorem 4,
where we identify the variety $W$ with our present variety $W_0$. We
had there an $(r-1)$-dimensional prime ideal $\mathfrak{p}'$ in $\mathfrak{o}'$ such that $\mathfrak{p}' \cap \mathfrak{o} = \mathfrak{p}$. From
the existence of an $(r-1)$-dimensional prime ideal in $\mathfrak{o}'$ which contracts to $\mathfrak{p}$
follows immediately the existence of prime ideals in $\mathfrak{o}'$ of any dimension $\rho$,


(12) By that we mean that $A'$ is the point which is defined by the contraction of the prime
ideal $\mathfrak{p}$ in the polynomial ring $K[\xi_1, \dots, \xi_r]$.

(13) The existence of $v'$ is proved in the joint paper by MacLane and Schilling [4].

(14) This is implied by the fundamental theorem on principal orders which states that an
integrally closed integral domain (not a field) is the intersection of the valuation rings which
contain it (see Krull [2, p. 111]). It is necessary only to observe that the integral closure of a
proper ring is also a proper ring. For if a ring $R$ is proper, it contains a non-unit $a$, and it is seen
immediately that $1/a$ cannot be integrally dependent on $R$.




$r-1 \ge \rho \ge s$, which contract to(15) $\mathfrak{p}$. Let $\mathfrak{p}''$ be such a prime ideal and let
$W''$ be the corresponding irreducible $\rho$-dimensional subvariety of $V'$. By the
preceding case (a), there exists a valuation $v$, of rank 1 and of dimension $\rho$,
whose center on $V'$ is $W''$. Since $\mathfrak{p}'' \cap \mathfrak{o} = \mathfrak{p}$, the center of $v$ on $V$ is $W$, q.e.d.

To prove our theorem in the general case, we first prove this lemma:


LEMMA 4. If $W$ and $W_1$ are irreducible subvarieties of $V$ such that $W \subset W_1 \subseteq V$
and if $v_1$ is a valuation of center $W_1$, then there exists a valuation $v$ of center $W$,
which is compounded with $v_1$.


Proof. Let $\Sigma_1$ be the residue field of the valuation $v_1$ and let $\tau_1$ denote the
homomorphic mapping of $\Sigma$ onto $(\Sigma_1, \infty)$ defined by $v_1$. Since $Q(W) \subset Q(W_1)$
(Lemma 1, I.3) and $Q(W_1) \subseteq R_{v_1}$, it follows that if we put $\mathfrak{J}=Q(W)$, then
$\tau_1\mathfrak{J} \subseteq \Sigma_1$. The elements of $\mathfrak{J}$ which are mapped into zero under $\tau_1$ are non-units
in $Q(W_1)$. Since $W$ is a proper subvariety of $W_1$, there are non-units in
$Q(W)$ which are units in $Q(W_1)$. Hence $\tau_1\mathfrak{J}$ is a proper ring. There exists then
at least one valuation $v'$ of $\Sigma_1$ such that $R_{v'} \supseteq \tau_1\mathfrak{J}$. Let $v_2$ denote the valuation
obtained by compounding $v_1$ with $v'$, and let $W_2$ be the center of $v_2$. By
Lemma 2, I.3, we have $W_2 \subseteq W_1$. Since $R_{v'} \supseteq \tau_1\mathfrak{J}$, it follows that $R_{v_2} \supseteq \mathfrak{J}$,
whence, by Theorem 3 (I.3), we have $W \subseteq W_2$. If $W_2=W$, then our lemma is
proved ($v=v_2$). If, however, $W$ is a proper subvariety of $W_2$, then we replace $v_1$
and $W_1$ by $v_2$ and $W_2$ and we repeat the above procedure. Since $v_2$ is of smaller
dimension than $v_1$, this process cannot continue indefinitely, q.e.d.

We now are in position to complete the proof of the theorem in the general
case. Since by the special cases (a) and (b) treated above the theorem
is true in the case $\sigma=1$, we assume that the theorem is true for $\sigma=h-1$ and
we proceed to prove the theorem for $\sigma=h$. We can therefore assume the
existence of the valuations $v_0, v_1, \dots, v_{h-2}$ and we have only to prove the
existence of $v_{h-1}$. The valuation $v_{h-2}$ is of dimension $\rho_{h-2}$ and its center is $W_{h-2}$.
We have to prove the existence of a valuation $v_{h-1}$, of dimension $\rho_{h-1}$, which
is compounded with $v_{h-2}$, is of rank one higher than the rank of $v_{h-2}$ and has
center $W_{h-1}$. For simplicity we shall denote $W_{h-2}$, $W_{h-1}$, $\rho_{h-2}$, $\rho_{h-1}$ and
$v_{h-2}$ by $W_1$, $W$, $\rho_1$, $\rho$ and $v_1$, respectively, so that we have now:


$$W_1 \supseteq W, \qquad \rho_1 > \rho \ge \text{dimension } W; \qquad \rho_1 \ge \text{dimension } W_1.$$


We first provide ourselves with a projective model $V'$ on which the center $W_1'$
of $v_1$ is exactly of dimension(16) $\rho_1$. We then consider, for auxiliary purposes,


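(15) Let $a_1, a_2, \dots, a_s$ be $s$ elements of $\mathfrak{o}$ which are algebraically independent modulo $\mathfrak{p}$
and let us take as new ground field the field $K_1 = K(a_1, a_2, \dots, a_s)$. If $\mathfrak{o}_1 = K_1\cdot\mathfrak{o}$, $\mathfrak{o}_1' = K_1\cdot\mathfrak{o}'$,
$\mathfrak{o}_1\cdot\mathfrak{p} = \mathfrak{p}_1$, $\mathfrak{o}_1'\cdot\mathfrak{p}' = \mathfrak{p}_1'$, then, over $K_1$, $\mathfrak{p}_1$ is zero-dimensional, $\mathfrak{p}_1'$ is $(r-1-s)$-dimensional, and
$\mathfrak{p}_1' \cap \mathfrak{o}_1 = \mathfrak{p}_1$. Let $\mathfrak{p}_1''$ be any $(\rho-s)$-dimensional prime divisor of $\mathfrak{p}_1'$, and let $\mathfrak{p}_1'' \cap \mathfrak{o}' = \mathfrak{p}''$. Then
$\mathfrak{p}''$ is of dimension $\rho$, over $K$, and $\mathfrak{p}'' \cap \mathfrak{o} = \mathfrak{p}$.

(16) As nonhomogeneous coordinates of such a model $V'$ we may take any finite set of generators
of $\Sigma$ which belong to $R_{v_1}$ and such that $\rho_1$ of these generators have algebraically independent
residues in the residue field of $v_1$.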




an arbitrary valuation $v$ which is compounded with $v_1$ and whose center on $V$
is $W$ (Lemma 4; if $W_1=W$, we put $v=v_1$). Let $W'$ denote the center of $v$
on $V'$, whence $W' \subseteq W_1'$ (Lemma 2, I.3). We next select nonhomogeneous
coordinates $\xi_1, \xi_2, \dots, \xi_n$ for the general point of $V$ and nonhomogeneous
coordinates $\zeta_1, \zeta_2, \dots, \zeta_m$ for the general point of $V'$ in such a fashion that $W$
and $W'$ be at finite distance with respect to these coordinates. Finally
we denote by $V^*$ the projective model whose general point is $(\xi_1, \xi_2, \dots, \xi_n,
\zeta_1, \zeta_2, \dots, \zeta_m)$ and we denote by $\mathfrak{o}$, $\mathfrak{o}'$ and $\mathfrak{o}^*$, respectively, the corresponding
rings $K[\xi]$, $K[\zeta]$, $K[\xi, \zeta]$ of the nonhomogeneous coordinates.

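Let $W^*$ and $W_1^*$ be the centers on $V^*$ of the valuations $v$ and $v_1$, respectively.
Since $W$ and $W'$ are at finite distance, we have: $\mathfrak{o} \subseteq R_v$, $\mathfrak{o}' \subseteq R_v$, and,
by a stronger reason, $\mathfrak{o} \subseteq R_{v_1}$, $\mathfrak{o}' \subseteq R_{v_1}$. Therefore $\mathfrak{o}^* \subseteq R_v$, $\mathfrak{o}^* \subseteq R_{v_1}$, and consequently
$W^*$ and $W_1^*$ are at finite distance. Let $\mathfrak{p}, \mathfrak{p}_1$; $\mathfrak{p}', \mathfrak{p}_1'$; $\mathfrak{p}^*, \mathfrak{p}_1^*$ denote the
prime ideals of $W, W_1$; $W', W_1'$; $W^*, W_1^*$ in $\mathfrak{o}$, $\mathfrak{o}'$ and $\mathfrak{o}^*$ respectively. It is
clear that: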


$$\mathfrak{p}^* \cap \mathfrak{o} = \mathfrak{p}, \qquad \mathfrak{p}_1^* \cap \mathfrak{o} = \mathfrak{p}_1.$$


Moreover $W^* \subseteq W_1^*$. The above relations show that any valuation of center $W^*$
on $V^*$ has $W$ as center on $V$. We also point out that $V^*$ shares with $V'$ the
property that the center of $v_1$ on that variety is exactly of dimension $\rho_1$. This
follows immediately from the relation: $\mathfrak{p}_1^* \cap \mathfrak{o}' = \mathfrak{p}_1'$. The variety $V'$ and the
auxiliary valuation $v$ have now served their purpose and will not be used any
more.

(1) Suppose first that $W^*$ is of dimension at most $\rho$. By the special case
$\sigma=1$, we can find a $\rho$-dimensional, rank 1 valuation of the field of rational
functions on $W_1^*$, having $W^*$ as center. This valuation has at least one extension
in the residue field of the valuation $v_1$. Let $v_2$ be such an extension.
Since the residue field of $v_1$ is an algebraic extension of the field of rational
functions on $W_1^*$, it follows that also $v_2$ is of rank 1 and dimension $\rho$. Compounding
$v_1$ with $v_2$, we get a valuation $v$ of $\Sigma$, of dimension $\rho$, of rank one
higher than $v_1$. Its center on $V^*$ is $W^*$, hence its center on $V$ is $W$. This valuation
$v$ is the valuation $v_{\sigma-1}$, whose existence we have claimed in our theorem.

(2) Suppose now that $W^*$ is of dimension greater than $\rho$. Since $\mathfrak{p}^* \cap \mathfrak{o} = \mathfrak{p}$
and since $\mathfrak{p}$ is of dimension at most $\rho$, it follows that we can find in $\mathfrak{o}^*$ a prime
$\rho$-dimensional ideal which divides $\mathfrak{p}^*$ and which likewise contracts(17) to $\mathfrak{p}$.
This prime ideal defines a $\rho$-dimensional irreducible subvariety $W_0^*$ of $V^*$
which we can use with the same effect instead of $W^*$, since the two essential
conditions: (1) $W_0^* \subseteq W_1^*$, (2) every valuation of center $W_0^*$ has $W$ as center
on $V$, are still satisfied. But now $W_0^*$ has dimension $\rho$, and we have therefore
the case (1) just considered. This completes the proof of Theorem 5.


(17) See footnote 15.




Part II. GENERAL THEORY OF BIRATIONAL CORRESPONDENCES 


1. Valuation-theoretic definition of a birational correspondence. Let $V$
and $V'$ be two birationally equivalent $r$-dimensional irreducible algebraic
varieties. The two varieties can be regarded as projective models of one and
the same field $\Sigma$ of algebraic functions(18), and if they are so regarded there
arises a well defined correspondence between the irreducible subvarieties of $V$
(of all possible dimensions from 0 to $r-1$ inclusive) and the irreducible subvarieties
of $V'$. It is the birational correspondence between $V$ and $V'$, or the
birational transformation of $V$ into $V'$. This correspondence, which we shall
denote by $T$, is defined as follows(19):


DEFINITION 1. Two irreducible subvarieties $W$ and $W'$ of $V$ and $V'$ respectively
(not necessarily of the same dimension) correspond to each other (in symbols:
$T(W) = W'$, $T^{-1}(W') = W$), if there exists a valuation $v$ of the field $\Sigma$ such
that the center of $v$ on $V$ is $W$ and the center of $v$ on $V'$ is $W'$.


Note that this definition retains its full meaning also when $V$ and $V'$ are
coincident varieties (as varieties in the projective space). In this case we deal
with an automorphism $\tau$ of $\Sigma$ and we have a birational transformation of $V$
into itself. It is only necessary to regard the two coincident varieties $V$ and
$V'$ as distinct projective models of $\Sigma$, in the sense that the general point of
$V$ is $(\xi_1, \dots, \xi_n)$ and the general point of $V'$ is $(\tau\xi_1, \dots, \tau\xi_n)$.

From the results of I.3 and I.4 we deduce immediately the following prop- 
erties of a birational correspondence: 

A. Given $W \subseteq V$, there exists at least one $W' \subseteq V'$ such that $T(W)=W'$
(Theorem 4, I.4).

B. If $W \subseteq W_1 \subseteq V$ and $W_1' = T(W_1)$, there exists a $W'$ such that $W'=T(W)$
and $W' \subseteq W_1'$. In particular, if to $W_1$ there corresponds a point $P'$ on $V'$, then $P'$
corresponds to each point of $W_1$.
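
A standard example, supplied here as an illustration and not taken from the text, shows how Definition 1 operates and why $T$ need not be single-valued. We assume $r=2$, $V$ the plane with nonhomogeneous coordinates $x, y$, and $V'$ the model with general point $(x, y, y/x)$. The valuation $v_O$ is the order of vanishing at the origin $O$.

```latex
% Assumed models: V with ring K[x,y]; V' with ring K[x, y, y/x] = K[x, y/x].
% v_O(x) = v_O(y) = 1, so on V the center of v_O is the origin O.
% On V': v_O(x) = 1 > 0 and v_O(y/x) = 0, with the residue of y/x
% transcendental over K, so the prime ideal of the center is (x):
\[
  T(O) = E, \qquad E = \{\, x = 0 \,\} \subset V' .
\]
% For any point P' of E, Lemma 4 (applied on V') gives a valuation v
% compounded with v_O whose center on V' is P'. By Lemma 2 its center
% on V is contained in O, hence equals O:
\[
  T(O) = P' \quad \text{for every point } P' \in E .
\]
```

Thus to the single point $O$ there correspond the curve $E$ and each of its points, in agreement with property B applied to $T^{-1}$.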

A birational correspondence is, generally speaking, not a (1, 1) corre- 
spondence. There may very well exist varieties W on V such that T is not 
single-valued at W, that is, such that to W there correspond more than one 
subvariety of $V'$. Similarly for $T^{-1}$ and $V'$. These varieties $W$ are exceptional
in the sense that they lie on algebraic subvarieties of V (see Theorem 15, 
II.9). The analysis of these exceptions to the (1, 1) character of a birational 
correspondence is the main goal of our study. At this stage, however, we wish 


(18) The fields $\Sigma$, $\Sigma'$ of rational functions on $V$ and $V'$, respectively, are isomorphic over $K$.
When we say that $V$ and $V'$ are projective models of one and the same field we imply that the
fields $\Sigma$, $\Sigma'$ have been identified. The identification is determined to within an automorphism
of $\Sigma$. When we speak of a birational correspondence we refer to a fixed identification of the
two fields.

(19) From now on irreducible subvarieties of $V$ shall be denoted by the letter $W$, with or
without subscripts. Similarly $W'$, $W_1'$, $W_2'$ and so on shall always denote irreducible subvarieties
of $V'$.




to give a very simple but important criterion for the uniqueness of T(W) when 
W is given: 


THEOREM 6. If $T(W)=W'$ and if $Q(W') \subseteq Q(W)$, then $W'$ is the only subvariety
of $V'$ which corresponds to $W$.


Proof. By hypothesis, there exists a valuation $v_1$ whose center on $V$ is $W$
and whose center on $V'$ is $W'$. Let $v$ be an arbitrary valuation of center $W$.
We have $R_v \supseteq Q(W)$, whence $R_v \supseteq Q(W')$. Every non-unit of $Q(W')$ is a non-unit
of $R_{v_1}$ (Theorem 3, I.3), hence it is also a non-unit of $Q(W)$. We conclude
that every non-unit of $Q(W')$ is a non-unit in $R_v$, and our theorem follows
from Theorem 3.

2. The birational correspondence between $V$ and a derived normal model
$\bar V$. In our paper [7] we have proved that from any irreducible algebraic variety
$V$ it is possible to pass to what we have called a derived normal model $\bar V$ of
$V$. That proof dealt only with algebraically closed ground fields of characteristic
zero. To extend the proof to arbitrary ground fields additional considerations
are necessary.

First of all we shall need an extension of the normalization theorem of
Emmy Noether to finite ground fields. Given a finite integral domain
$K[\xi_1, \xi_2, \dots, \xi_n]$ of degree of transcendency $r$ over an infinite ground field
$K$, the Noether normalization theorem states that there exist $r$ linear combinations
$\xi_i' = \sum_j c_{ij}\xi_j$, $i=1, 2, \dots, r$, with coefficients in $K$, which are algebraically
independent over $K$ and which are such that $\xi_1, \xi_2, \dots, \xi_n$ are
integrally dependent on $\xi_1', \xi_2', \dots, \xi_r'$. This theorem, as it stands, is not
generally true when $K$ is a finite field. In this case we can still assert that elements
such as $\xi_1', \xi_2', \dots, \xi_r'$ can be found in the ring $K[\xi_1, \dots, \xi_n]$,
provided we drop the condition that these elements be linear in the $\xi$’s. The
proof of this assertion, as given below, was communicated orally to me by
Irvin Cohen.

We shall first consider a homogeneous integral domain $\mathfrak{o}^* = K[\eta_0, \eta_1,
\dots, \eta_n]$, of degree of transcendency $r+1$, that is, one whose generating elements
$\eta_i$ are the homogeneous coordinates of the general point of an $r$-dimensional
variety $V$.

(a) If the ideal $(\eta_1, \eta_2, \dots, \eta_n)$ in $\mathfrak{o}^*$ is irrelevant, then $\eta_0$ is integrally
dependent on $\eta_1, \eta_2, \dots, \eta_n$. For the hypothesis implies that the point
$y_0=1$, $y_1 = \dots = y_n=0$ is not on $V$, and hence there must exist a form
$f(y_0, y_1, \dots, y_n)$ such that $f(\eta_0, \eta_1, \dots, \eta_n) = 0$ and $f(1, 0, \dots, 0) \neq 0$.
If $\rho$ is the degree of $f$, the term $y_0^{\rho}$ must therefore occur in $f(y)$, and this
proves our assertion.

(b) More generally, if the ideal $(\eta_{k+1}, \eta_{k+2}, \dots, \eta_n)$ is irrelevant, then
$\eta_0, \eta_1, \dots, \eta_k$ are integrally dependent on $\eta_{k+1}, \eta_{k+2}, \dots, \eta_n$. Proof by induction
with respect to $k$ (that is, with respect to the number $k+1$ of elements
which do not occur in the set $\eta_{k+1}, \eta_{k+2}, \dots, \eta_n$). By (a), $\eta_0$ is integrally




dependent on $\eta_1, \eta_2, \dots, \eta_n$. Since every element of $\mathfrak{o}^*$ is integrally dependent
on $K[\eta_1, \eta_2, \dots, \eta_n]$, the elements $\eta_{k+1}, \eta_{k+2}, \dots, \eta_n$ generate in the ring
$K[\eta_1, \eta_2, \dots, \eta_n]$ an ideal of the same dimension as that of the ideal generated
by them in $\mathfrak{o}^*$, that is, they generate in $K[\eta_1, \eta_2, \dots, \eta_n]$ an irrelevant
ideal. By our induction it follows that $\eta_1, \eta_2, \dots, \eta_k$ are integrally dependent
on $\eta_{k+1}, \eta_{k+2}, \dots, \eta_n$, q.e.d.

(c) If $\omega_0, \omega_1, \dots, \omega_s$ are forms in $\eta_0, \eta_1, \dots, \eta_n$, all of the same degree $h$,
and if the ideal $(\omega_0, \omega_1, \dots, \omega_s)$ is irrelevant, then the $\eta$’s are integrally dependent
on the $\omega$’s. For if $\eta^{(1)}, \eta^{(2)}, \dots$ form a linear base for the forms of degree $h$
in the $\eta$’s and if we include the $\omega$’s in this base, then applying (b) to the ring
$K[\eta^{(1)}, \eta^{(2)}, \dots]$, we conclude that $\eta_0^h, \eta_1^h, \dots, \eta_n^h$ are integrally dependent
on the $\omega$’s.

(d) We obviously can select (in many ways) $r+1$ forms $\zeta_0, \zeta_1, \dots, \zeta_r$ in
$\mathfrak{o}^*$ such that the ideal $(\zeta_0, \zeta_1, \dots, \zeta_r)$ be irrelevant. We can then find exponents
$\sigma_i$ such that the forms $\omega_i = \zeta_i^{\sigma_i}$ be of like degree. Then it follows, by
(c), that the $\eta$’s are integrally dependent on $\omega_0, \omega_1, \dots, \omega_r$. This completes
the proof of the extended “normalization theorem” for homogeneous integral
domains.

From a homogeneous integral domain $K[\eta_0, \eta_1, \dots, \eta_n]$, of degree of
transcendency $r+1$, we get an arbitrary integral domain $K[\xi_1, \xi_2, \dots, \xi_n]$,
of degree of transcendency $r$, by putting $\xi_i=\eta_i/\eta_0$. We apply step (d) above,
and we observe that of the $r+1$ forms $\zeta_i$, one, say $\zeta_0$, can be taken arbitrarily.
If we put $\zeta_0=\eta_0$, then $\omega_0$ is a power of $\eta_0$, say $\omega_0=\eta_0^h$. It is then seen immediately
that the $r$ elements $\xi_i' = \omega_i/\eta_0^h$, $i=1, 2, \dots, r$, are polynomials in the
$\xi$’s and that $\xi_1, \xi_2, \dots, \xi_n$ are integrally dependent on $\xi_1', \dots, \xi_r'$. This
completes the proof.
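
To see concretely why linearity must be dropped over a finite field, consider the following worked example. It is an illustration added here, not part of the original proof: the ground field $K = \mathbb{F}_2$, the defining relation, and the element $u$ are all assumptions chosen for the sketch ($r = 1$, $n = 2$).

```latex
% Assumed ring: K[\xi_1,\xi_2], K = F_2, with the single relation
\[
  \xi_1 \xi_2 (\xi_1 + \xi_2) = 1 .
\]
% The only nonzero linear forms over F_2 are \xi_1, \xi_2, \xi_1+\xi_2.
% Over K[\xi_2], the element \xi_1 satisfies
\[
  \xi_1^{2} + \xi_2\,\xi_1 + \frac{1}{\xi_2} = 0 ,
\]
% which is irreducible over K(\xi_2): putting \xi_1 = \xi_2 t gives
% t^2 + t = \xi_2^{-3}, impossible in K(\xi_2) since the pole order 3 is odd.
% The constant term 1/\xi_2 is not in K[\xi_2], so \xi_1 is not integral
% over K[\xi_2]. By symmetry \xi_2 is not integral over K[\xi_1], and with
% z = \xi_1 + \xi_2 the relation becomes z\xi_1^2 + z^2\xi_1 + 1 = 0, of the
% same shape, so \xi_1 is not integral over K[z] either.
%
% A non-linear choice works: put u = \xi_1 + \xi_2^{2}. Then
\[
  \xi_2^{5} + \xi_2^{4} + u\,\xi_2^{2} + u^{2}\,\xi_2 + 1 = 0 ,
\]
% which is monic in \xi_2 over K[u]; hence \xi_2, and with it
% \xi_1 = u + \xi_2^{2}, is integrally dependent on K[u].
```

The element $u$ plays the role of $\xi_1'$ in the statement above. It is a polynomial in the $\xi$'s but not a linear one.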

Let $(\eta_0, \eta_1, \dots, \eta_n)$ be the general point of $V$ (the coordinates $\eta_i$ are
homogeneous(20)) and let $P=K[\eta_0, \eta_1, \dots, \eta_n]$. Let $\bar P$ be the integral closure
of $P$ in its quotient field. We first need to establish in the most general case
that $\bar P$ is a finite $P$-module. By the normalization theorem of Emmy
Noether, let $\zeta_0, \zeta_1, \dots, \zeta_r$ be $r+1$ algebraically independent elements
in $P$ such that every element of $P$ is integrally dependent on $K[\zeta_0, \zeta_1, \dots, \zeta_r]$.
This last ring is a polynomial ring and we shall denote it by $R$. Since the field
$K(\eta_0, \eta_1, \dots, \eta_n)$ is a finite algebraic extension of the quotient field
of the polynomial ring $R$, it follows that the integral closure $\bar R$ of $R$ in this field is a
finite $R$-module. This result, for arbitrary ground fields, has been proved by
F. K. Schmidt [5]. Since $P \subseteq \bar R$, it follows that every element of $\bar P$ is integrally
dependent on $\bar R$. But since $\bar R$ is a finite $R$-module and is therefore a chain-theorem
ring, it is well known that every element of this field which is integrally
dependent on $\bar R$ is also integrally dependent on $R$, that is, belongs to $\bar R$.
Hence $\bar P \subseteq \bar R$, that is, $\bar P = \bar R$. Thus $\bar P$ is a finite $R$-module, q.e.d.


(20) To avoid repetitions, we stipulate from now on that whenever the subscript in a set of
coordinates begins with 0, the coordinates are homogeneous.




Since $\bar P$ is a finite $P$-module, it is a finite integral domain over $K$. Let
$\bar P=K[\zeta_0, \zeta_1, \dots, \zeta_m]$. As in the quoted paper [7] (see p. 290), we may
assume also here that the $\zeta_i$ are homogeneous elements.

Then by exactly the same procedure as that carried out in our quoted
paper we can show that if $\omega_0^*, \omega_1^*, \dots, \omega_\mu^*$ is a linear $K$-basis for all the
homogeneous elements of $\bar P$ of a given degree $\delta$, then for suitable integers $\delta$
it will be true that every homogeneous element in $\bar P$ whose degree is a multiple
$\rho\delta$ of $\delta$, $\rho>0$, is necessarily a form of degree $\rho$ in $\omega_0^*, \dots, \omega_\mu^*$. From
that we concluded in the quoted paper that for such an integer $\delta$ the ring
$P^*=K[\omega_0^*, \dots, \omega_\mu^*]$ is integrally closed in its quotient field, whence the
variety $\bar V$ whose general point is $(\omega_0^*, \omega_1^*, \dots, \omega_\mu^*)$ is normal. This variety $\bar V$
we termed a derived normal variety of $V$. It was pointed out to me by Irvin
Cohen that the above conclusion fails to hold true if $K$ is not maximally
algebraic in the field $\Sigma$ of rational functions on $V$, that is, if $V$ is not absolutely
irreducible (see [10, Lemma 4, p. 64]). For in this case the elements of $\Sigma$
which are algebraic over $K$ but are not in $K$, that is, the homogeneous elements
of degree zero, are certainly not in the ring $P^*$. However, it is still true that
$P^*$ contains all homogeneous integral quantities (that is, the homogeneous
elements of the quotient field of $P^*$ which are integrally dependent on $P^*$)
of positive degree. Hence if $\alpha^*$ is any integral quantity in the quotient field
of $P^*$, then the products $\alpha^*\omega_i^*$, $i=0, 1, \dots, \mu$, belong to $P^*$, since they
are sums of homogeneous integral quantities of positive degree. It follows
that the irrelevant ideal $P^*\cdot(\omega_0^*, \omega_1^*, \dots, \omega_\mu^*)$ is the conductor of the ring $P^*$
with respect to its integral closure in the quotient field of $P^*$. This implies that
the variety $\bar V$ is locally normal in the sense of Definition 3 given later on in
this section (see also [7, Theorem 13, p. 286]). For our purpose a locally
normal variety is just as effective as a normal variety. We shall continue to
call $\bar V$ the derived normal model of $V$, it being understood that if $V$ is not
absolutely irreducible then $\bar V$ is only locally normal. It may be well to point
out at this stage the self-evident fact that if $V$ is not absolutely irreducible,
the field $\Sigma$ does not possess at all normal models over $K$.

For the general theory of birational correspondences it is necessary to 
establish first some properties of the birational correspondence between a 
given model $V$ and a derived normal model $\bar V$ of $V$. For it will follow from
the properties of this birational correspondence that the properties of a bi- 
rational correspondence between any two models V, V’ can be readily de- 
duced from the properties of the birational correspondence between the 
derived normal models of V and V’. Therefore, there is no loss of generality 
if the theory is restricted to normal models. On the other hand, the emphasis 
on normal models is advantageous, both from a technical and a conceptual 
standpoint, since in the case of normal varieties the theory of birational cor- 
respondences is free from many accidental complications and irrelevant ex- 
ceptions which one often encounters on non-normal varieties. 




Let us therefore consider the case in which one of the two birationally
equivalent varieties $V$, $V'$ is a derived normal model of the other. Let,
say, $V'$ be a derived normal model of $V$ and let $(\eta_0, \eta_1, \dots, \eta_n)$ and
$(\eta_0', \eta_1', \dots, \eta_m')$ be the general points of $V$ and of $V'$, respectively. Let $h$
be the degree of homogeneity of $V'$. The elements $\eta_i'$ form then a linear base
for the elements of the field $K(\eta_0, \eta_1, \dots, \eta_n)$ which are integrally dependent
on $\eta_0, \eta_1, \dots, \eta_n$ and which are homogeneous of degree $h$. Moreover, the ring
$P'=K[\eta_0', \eta_1', \dots, \eta_m']$ is integrally closed (in its quotient field), or at any
rate contains all homogeneous integers of positive degree.

It will be convenient to use an auxiliary projective model $V^*$ defined as
follows. Let $\eta_0^*, \eta_1^*, \dots, \eta_s^*$ be a linear base for the forms of degree $h$ in
$\eta_0, \eta_1, \dots, \eta_n$, with coefficients in $K$. We take as $V^*$ the variety whose general
point is $(\eta_0^*, \eta_1^*, \dots, \eta_s^*)$.


LEMMA 5. The birational correspondence between $V$ and $V^*$ is (1, 1) without
exceptions(21). Any two corresponding irreducible subvarieties of $V$ and $V^*$ have
the same dimension and the same quotient ring.


Proof. Let $W$ and $W^*$ be two corresponding irreducible subvarieties of $V$
and $V^*$, respectively. We assume that $\eta_0 \neq 0$ on $W$. Since the $\eta^*$’s constitute
a linear base for the forms of degree $h$ in the $\eta$’s, we may assume that $\eta_0^h$ is
one of the $\eta^*$’s, say $\eta_0^h=\eta_0^*$. If $v$ be a valuation of center $W$ and $W^*$, then
$v(\eta_i/\eta_0) \ge 0$, since $\eta_0 \neq 0$ on $W$. From this it follows that $v(\eta_j^*/\eta_0^*) \ge 0$, for
$j=0, 1, \dots, s$, and hence (see I.3) $\eta_0^* \neq 0$ on $W^*$. Now let $\xi$ be any element
of $Q(W)$, say $\xi=\phi(\eta)/\psi(\eta)$, where $\phi$ and $\psi$ are forms of like degree and where
$\psi(\eta) \neq 0$ on $W$. We may assume that the common degree of $\phi$ and $\psi$ is a multiple
of $h$, say $\rho h$, since we can multiply both $\phi$ and $\psi$ by any power of $\eta_0$
without destroying the inequality: $\psi \neq 0$ on $W$. But if $\phi$ and $\psi$ are of degree $\rho h$,
then they can be expressed as forms of degree $\rho$ in the $\eta^*$’s: $\phi(\eta) = \phi^*(\eta^*)$,
$\psi(\eta) = \psi^*(\eta^*)$. Since $\psi(\eta) \neq 0$ on $W$, we have $v(\psi(\eta)/\eta_0^{\rho h}) = 0$. Hence
$v(\psi^*(\eta^*)/\eta_0^{*\rho}) = 0$, and this shows that $\psi^*(\eta^*) \neq 0$ on $W^*$. Since $\xi = \phi^*(\eta^*)/\psi^*(\eta^*)$,
we conclude that $\xi \in Q(W^*)$.

A quite similar argument shows that if $\xi \in Q(W^*)$, then $\xi \in Q(W)$. Hence
the quotient rings Q(W) and Q(W*) coincide, and from this our lemma fol- 
lows in view of Theorem 3 (I.3). 
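
As a hedged illustration of the passage from V to V* used in this proof, the following sketch takes the simplest nontrivial case. The choice n = 1, h = 2 and the particular forms are assumptions made here for concreteness and are not taken from the text; the symbols $\eta_i$, $\eta_j^*$, $\varphi$, $\psi$ follow the notation of the lemma.

```latex
% Illustrative choice (an assumption, not from the text): n = 1, h = 2, hence s = 2.
\begin{align*}
  \eta_0^* &= \eta_0^2, \qquad \eta_1^* = \eta_0\eta_1, \qquad \eta_2^* = \eta_1^2, \\
% A quotient of two forms of degree 2 = \rho h (with \rho = 1) becomes a
% quotient of two forms of degree \rho = 1 in the \eta^*'s:
  \frac{\varphi(\eta)}{\psi(\eta)}
    &= \frac{\eta_0\eta_1 + \eta_1^2}{\eta_0^2 + \eta_1^2}
     = \frac{\eta_1^* + \eta_2^*}{\eta_0^* + \eta_2^*}
     = \frac{\varphi^*(\eta^*)}{\psi^*(\eta^*)}, \\
% When the common degree is not a multiple of h, multiply by a power of \eta_0 first:
  \frac{\eta_1}{\eta_0} &= \frac{\eta_0\eta_1}{\eta_0^2} = \frac{\eta_1^*}{\eta_0^*}.
\end{align*}
```

In this special case V* is the re-embedding of V by the forms of degree 2, and every quotient of forms of like degree on V is visibly a quotient of forms of like degree on V*, which is the mechanism behind the equality of quotient rings.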

The lemma shows that as far as the study of the birational correspondence 
between V and V’ is concerned, it is permissible to replace V by V*. Let us 
see therefore how V’ is related to V*. 

Every homogeneous element in $K(\eta_0, \eta_1, \dots, \eta_n)$ of degree h, and in
particular each element $\eta'_i$, can be written as a quotient of two forms in
the $\eta$'s whose degrees are multiples of h. Any such quotient is a quotient of
two forms in the $\eta^*$'s. Hence $\eta'_i \in K(\eta_0^*, \eta_1^*, \dots, \eta_s^*)$. Conversely, each
element $\eta_j^*$, being homogeneous of degree h, is a linear form in the $\eta'$'s. We
therefore conclude that the two fields $K(\eta_0^*, \eta_1^*, \dots, \eta_s^*)$ and
$K(\eta'_0, \eta'_1, \dots, \eta'_m)$ coincide.

(21) When we say that a birational correspondence between two varieties V and V' is (1, 1)
without exceptions, we mean that it is (1, 1) as a correspondence between the irreducible
subvarieties of V and the irreducible subvarieties of V'.

The elements $\eta_j^*$, which as elements of the field $K(\eta_0, \eta_1, \dots, \eta_n)$ are
homogeneous of degree h, as elements of the field $K(\eta_0^*, \eta_1^*, \dots, \eta_s^*)$ are
to be regarded as homogeneous, of degree 1. The same remark applies to
the elements $\eta'_i$. Moreover, the elements $\eta'_i$ constitute a linear base for the
elements of the field $K(\eta_0^*, \eta_1^*, \dots, \eta_s^*)$ which are homogeneous of degree 1
and which are integrally dependent on $\eta_0^*, \eta_1^*, \dots, \eta_s^*$. We conclude from
all this that V' is also a derived normal variety of V*, of degree of homogeneity 1.
Thus, while not every variety V possesses a derived normal variety of degree
of homogeneity 1, we may nevertheless assume (and we do so assume) that
we had originally h = 1; this assumption amounts to replacing V by V*.

Now that we have h = 1, it follows that the $\eta$'s are linear combinations of
the $\eta'$'s, whence V is a projection of the normal model V'. Moreover, the ring
$P' = K[\eta']$ is now either the integral closure of the ring $P = K[\eta]$ in its quotient
field or contains at any rate all homogeneous integral quantities of
positive degree.

Let now W and W' be corresponding irreducible subvarieties of V and V',
respectively. We assume that $\eta_0 \neq 0$ on W and that $\eta'_0 = \eta_0$. I assert that
$\eta'_0 \neq 0$ on W'. To see this we have only to show that $v(\eta'_i/\eta'_0) \geq 0$,
$i = 0, 1, \dots, m$, for at least one valuation of center W'. We take as v a valuation
which has also W as center on V. We write the relation of integral dependence
for $\eta'_i$ over P. It is of the form (see our paper [7, p. 286, equation (33)]):

$(\eta'_i)^{\nu} + a_1(\eta)\,(\eta'_i)^{\nu-1} + \dots + a_\nu(\eta) = 0,$

where $a_j(\eta)$ is a form of degree j in $\eta_0, \eta_1, \dots, \eta_n$. If we divide this equation
by $\eta_0^{\nu}$, we see that the quotients $\eta'_i/\eta'_0$ are integrally dependent on the quotients
$\eta_1/\eta_0, \eta_2/\eta_0, \dots, \eta_n/\eta_0$. Since $\eta_0 \neq 0$ on W, these quotients are in $R_v$,
and this proves our assertion.
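
The division step can be written out explicitly. The following is a sketch only: it uses the nonhomogeneous coordinates $\xi_i = \eta_i/\eta_0$ and $\xi'_i = \eta'_i/\eta'_0$ (with $\eta'_0 = \eta_0$) that are introduced a little further on in this section, and the exponent $\nu$ is the degree of the relation of integral dependence.

```latex
% Notation assumed: \xi_i = \eta_i/\eta_0, \xi'_i = \eta'_i/\eta'_0, with \eta'_0 = \eta_0.
\begin{align*}
  &(\eta'_i)^{\nu} + a_1(\eta)\,(\eta'_i)^{\nu-1} + \dots + a_\nu(\eta) = 0
     \qquad (\deg a_j = j) \\
  &\Longrightarrow\quad
   \Bigl(\frac{\eta'_i}{\eta_0}\Bigr)^{\nu}
   + \frac{a_1(\eta)}{\eta_0}\Bigl(\frac{\eta'_i}{\eta_0}\Bigr)^{\nu-1}
   + \dots + \frac{a_\nu(\eta)}{\eta_0^{\nu}} = 0, \\
  &\text{where}\quad \frac{a_j(\eta)}{\eta_0^{\,j}} = a_j(1, \xi_1, \dots, \xi_n)
     \in K[\xi_1, \dots, \xi_n] \subset R_v .
\end{align*}
```

Since a valuation ring is integrally closed, the monic equation shows that $\xi'_i \in R_v$, that is, $v(\eta'_i/\eta'_0) \geq 0$, which is the inequality required in the proof.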

Let $\mathfrak{P}$ and $\mathfrak{P}'$ be the prime homogeneous ideals of W and of W' in the
rings P and P', respectively. From the fact that $\eta_0 \neq 0$ on W and $\eta'_0\ (= \eta_0) \neq 0$
on W' and from the very definition of the center of a valuation, it follows
immediately that $\mathfrak{P} = \mathfrak{P}' \cap P$. Since the elements of P' are integrally dependent
on P, there is only a finite number of prime ideals in P' which contract
to $\mathfrak{P}$. These ideals are all homogeneous(22) and of the same dimension as $\mathfrak{P}$.
We therefore reach the following conclusion:

To each irreducible subvariety W' of V' there corresponds a unique subvariety
W of V, while to each irreducible subvariety W of V there corresponds a finite
number of subvarieties W' of V'. Two corresponding varieties W and W' have
the same dimension.

(22) The prime ideals $\mathfrak{P}'$ which contract to $\mathfrak{P}$ are the minimal primes of the extended ideal
$P'\cdot\mathfrak{P}$ and therefore are homogeneous, by Theorem 2 (I.1).

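We now investigate the relationship between the quotient rings of two corresponding
varieties W and W'. We pass to the rings $\mathfrak{o} = K[\xi_1, \xi_2, \dots, \xi_n]$
and $\mathfrak{o}' = K[\xi'_1, \xi'_2, \dots, \xi'_m]$ of the nonhomogeneous coordinates $\xi_i = \eta_i/\eta_0$
and $\xi'_i = \eta'_i/\eta'_0$, where $\eta_0 = \eta'_0$. Here $\mathfrak{o}'$ is the integral closure of $\mathfrak{o}$. Let $\mathfrak{p}$ be
the prime $\mathfrak{o}$-ideal of W. Any W' which corresponds to W will be at finite
distance with respect to the coordinates $\xi'_i$ and will be given in $\mathfrak{o}'$ by a prime
ideal which contracts to $\mathfrak{p}$. Let $\mathfrak{p}'_1, \mathfrak{p}'_2, \dots, \mathfrak{p}'_\nu$ be the prime $\mathfrak{o}'$-ideals which
contract to $\mathfrak{p}$, and let $W'_1, W'_2, \dots, W'_\nu$ be the corresponding subvarieties
of V'. Let

$\mathfrak{J} = Q(W) = \mathfrak{o}_{\mathfrak{p}}, \qquad \mathfrak{J}'_i = Q(W'_i) = \mathfrak{o}'_{\mathfrak{p}'_i},$

and let $\mathfrak{m}$ and $\mathfrak{m}'_i$ denote the ideals of non-units in $\mathfrak{J}$ and $\mathfrak{J}'_i$, respectively.
We have $\mathfrak{J}'_i \supset \mathfrak{J}$ and $\mathfrak{J}'_i$ is integrally closed. Let $\mathfrak{J}^*$ denote the integral
closure of $\mathfrak{J}$. Then $\mathfrak{J}'_i \supset \mathfrak{J}^* \supset \mathfrak{o}'$ and we can consider the ideals $\mathfrak{m}^*_i = \mathfrak{m}'_i \cap \mathfrak{J}^*$.
The ideals $\mathfrak{m}^*_1, \dots, \mathfrak{m}^*_\nu$ are distinct, since $\mathfrak{m}^*_i \cap \mathfrak{o}' = \mathfrak{p}'_i$. The
ideals $\mathfrak{m}^*_i$ contract to one and the same ideal in $\mathfrak{J}$, namely to $\mathfrak{m}$, since $\mathfrak{p}'_i \cap \mathfrak{o} = \mathfrak{p}$
and $\mathfrak{m} = \mathfrak{J}\cdot\mathfrak{p}$. The quotient ring of $\mathfrak{m}^*_i$ in $\mathfrak{J}^*$ is contained in $\mathfrak{J}'_i$ since
$\mathfrak{m}^*_i = \mathfrak{m}'_i \cap \mathfrak{J}^*$. On the other hand, the quotient ring $\mathfrak{J}'_i$ is contained in the
quotient ring of $\mathfrak{m}^*_i$, since $\mathfrak{m}^*_i \cap \mathfrak{o}' = \mathfrak{p}'_i$. Hence the quotient ring of $\mathfrak{m}^*_i$ in the
ring $\mathfrak{J}^*$ coincides with $\mathfrak{J}'_i$.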

The foregoing properties of $\mathfrak{J}^*$ can also be derived from the general theory
of quotient rings (see I.2 (#)). It is only necessary to observe that $\mathfrak{J}^*$ coincides
with the quotient ring $\mathfrak{o}'_S$, where $S = \mathfrak{o} - \mathfrak{p}$. From this remark it follows
immediately that the ideals $\mathfrak{m}^*_1, \mathfrak{m}^*_2, \dots, \mathfrak{m}^*_\nu$ are the only prime ideals of $\mathfrak{J}^*$
which contract to $\mathfrak{m}$. The connection between the quotient ring Q(W) and
the $\nu$ quotient rings $Q(W'_i)$ is therefore fully established. Summarizing, we can
now state the following theorem:


THEOREM 7. The birational correspondence between an irreducible algebraic
variety V and a derived normal variety V' of V has the following properties:

(A) Two corresponding subvarieties W and W' of V and V', respectively,
have the same dimension, and we have: $Q(W) \subset Q(W')$.

(B) Given W', the corresponding W is uniquely determined, while to a given
W there corresponds a finite number of varieties W'.

(C) If $\mathfrak{J}^*$ denotes the integral closure of the quotient ring $\mathfrak{J} = Q(W)$, and
if $\mathfrak{m}^*_1, \mathfrak{m}^*_2, \dots, \mathfrak{m}^*_\nu$ are the prime ideals in $\mathfrak{J}^*$ which contract to the ideal of
non-units in $\mathfrak{J}$, then there are exactly $\nu$ varieties $W'_1, W'_2, \dots, W'_\nu$ which correspond
to W, and for a suitable ordering of the indices we will have $Q(W'_i) = \mathfrak{J}^*_{\mathfrak{m}^*_i}$.


COROLLARY 1. The birational correspondence between any two derived normal
varieties of V is (1, 1) without exceptions, and corresponding subvarieties have 
the same quotient ring. 




Other corollaries follow from Theorem 7. We first give the following defini- 
tions: 


DEFINITION 2. If Q(W) is integrally closed, then V is said to be locally 
normal at W. 


DEFINITION 3. If V is locally normal at each of its irreducible subvarieties, 
then V is said to be a locally normal variety(23).


A locally normal variety is characterized by the property that it is normal
in the affine space for every choice of the nonhomogeneous coordinates. For
that it is necessary and sufficient (see our paper [7, Theorem 13, p. 286]) that
the conductor of the ring $P = K[\eta_0, \eta_1, \dots, \eta_n]$ with respect to the integral
closure of P be an irrelevant ideal.


COROLLARY 2. If V is locally normal at W then to W there corresponds a
unique subvariety W' of the derived normal model V', and the quotient rings
Q(W), Q(W') coincide.

COROLLARY 3. If V is locally normal, then the birational correspondence be-
tween V and a derived normal variety V' of V is (1, 1) without exceptions, and 
the correspondence preserves quotient rings. 


COROLLARY 4. The irreducible subvarieties of V to which there corresponds
more than one variety on V' all lie on the subvariety C of V which is defined by
the conductor $\mathfrak{C}$ of the ring $P = K[\eta_0, \eta_1, \dots, \eta_n]$ with respect to the integral
closure of P. Outside of C, the birational correspondence between V and V' is
(1, 1) and it preserves quotient rings.
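
Theorem 7 and Corollary 4 can be illustrated on a classical curve. The sketch below is not taken from the text: it assumes a ground field K of characteristic different from 2, works in affine (nonhomogeneous) coordinates rather than with the homogeneous ring P, and uses the nodal cubic $y^2 = x^2(x+1)$ with its normalization parameter $t = y/x$; the ideal $\mathfrak{C}$ computed here is the affine analogue of the conductor of Corollary 4.

```latex
% Assumed example (not from the text): nodal cubic over a field K of
% characteristic \neq 2, in affine coordinates x, y with y^2 = x^2(x+1).
\begin{align*}
  \mathfrak{o} &= K[x, y] = K[t^2 - 1,\ t(t^2 - 1)]
     \subset \mathfrak{o}' = K[t], \qquad t = y/x, \quad t^2 = x + 1, \\
  \mathfrak{p} &= (x, y) \quad\text{(the node } W\text{)}, \qquad
  \mathfrak{o}'\mathfrak{p} = (t^2 - 1) = (t - 1) \cap (t + 1), \\
  \mathfrak{p}'_1 &= (t - 1), \quad \mathfrak{p}'_2 = (t + 1), \qquad
  Q(W) \subsetneq Q(W'_i) = K[t]_{\mathfrak{p}'_i} \quad (i = 1, 2), \\
  \mathfrak{C} &= \{\, c \in \mathfrak{o} : c\,\mathfrak{o}' \subset \mathfrak{o} \,\}
     = (t^2 - 1)\,K[t] = \mathfrak{p}.
\end{align*}
```

Here $\nu = 2$: two points of the normal model lie over the node, the inclusion of quotient rings is proper there, and the conductor defines exactly the node, outside of which the correspondence is (1, 1).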


3. The fundamental elements of a birational correspondence. We con- 
sider a birational correspondence T between two locally normal varieties V 
and V’ (the general case is discussed briefly at the end of this section). Let W 
be an irreducible subvariety of V. 


DEFINITION 4. We say that W is (1) regular, (2) irregular, or (3) fundamental
for T if there exists a W' on V' such that W' = T(W) and, respectively,
(1) $Q(W) = Q(W')$, (2) $Q(W) \supset Q(W')$ (proper inclusion), or (3) $Q(W) \not\supseteq Q(W')$.
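
The three cases of the definition can be seen in a simple example, offered here as a sketch and not taken from the text. It assumes that V is the plane with affine coordinates x, y, that V' is described in one affine chart by x and t = y/x, and that only subvarieties W' at finite distance in that chart are considered; the names P and E' are introduced for illustration.

```latex
% Assumed example (not from the text): V the plane with affine coordinates
% x, y; V' given in one affine chart by x and t = y/x; T the resulting
% birational correspondence; P the point x = y = 0 of V.
\begin{align*}
  Q(P) &= K[x, y]_{(x, y)}, \qquad \mathfrak{o}' = K[x, t], \quad t = y/x, \\
  t &\notin Q(P): \quad t = a/b,\ b(0,0) \neq 0 \ \Longrightarrow\ y\,b = x\,a
       \ \Longrightarrow\ x \mid b, \ \text{a contradiction}, \\
  \mathfrak{o}' &\subset Q(W') \ \text{for every } W' \text{ at finite distance in this chart}
       \ \Longrightarrow\ Q(P) \not\supseteq Q(W'), \\
  E' &: x = 0 \ \text{on } V', \qquad Q(E') = K[x, t]_{(x)} \supsetneq Q(P),
       \qquad T^{-1}(E') = P .
\end{align*}
```

Under these assumptions P satisfies condition (3) with respect to every such W', while the curve E' satisfies condition (2) for the inverse correspondence: it is irregular for $T^{-1}$, not fundamental. Every point of V other than P at which t is finite satisfies condition (1).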


The following theorem is merely a statement of some properties of regular, 
irregular and fundamental varieties which follow directly from the definition 
and from Theorem 6 (II.1) and which we shall use very frequently: 


THEOREM 8.
(A) If W is regular or irregular, then to W there corresponds a unique W'
on V'.

(B) If W is regular for T and if W' = T(W), then W' is of the same dimension
as W and is regular for $T^{-1}$.

(C) If W is irregular and W' = T(W), then W' is fundamental for $T^{-1}$, and
the dimension of W' is less than or equal to the dimension of W.

(D) If W is fundamental, then $Q(W) \not\supseteq Q(W')$ for any W' which corresponds
to W.

(E) A necessary and sufficient condition that W not be fundamental is that
for a suitable choice of the nonhomogeneous coordinates $\xi'_1, \xi'_2, \dots, \xi'_m$ of the
general point of V', the ring $\mathfrak{o}'$ of these coordinates be contained in Q(W).

(23) For V to be locally normal it is sufficient that V be locally normal at each of its points.
For if W is any irreducible subvariety of V and if P is any point of W, then Q(W) is also the
quotient ring of a prime ideal in Q(P). If Q(P) is integrally closed, also Q(W) is integrally closed.


The proof of (E) is immediate. For assume that $\mathfrak{o}' \subset Q(W)$ and let $\mathfrak{p}'$ be
the prime $\mathfrak{o}'$-ideal which is the contraction of the ideal of non-units of Q(W).
Then if W' is the irreducible subvariety of V' which is defined by $\mathfrak{p}'$, then
$Q(W') \subseteq Q(W)$ and by Theorem 3 every valuation of center W on V has
center W' on V'.

The sets of regular, irregular and fundamental varieties are mutually ex- 
clusive. We now give the following definition: 


DEFINITION 5. A birational correspondence T is regular, if every W is regular
for T.


It is clear that if T is regular, then also $T^{-1}$ is regular (Theorem 8 (B)).
A regular birational correspondence is a (1, 1) correspondence, without ex- 
ceptions, and it preserves quotient rings. We have encountered examples of 
regular birational correspondences in the preceding section (Lemma 5; Theo- 
rem 7, Corollaries 1 and 3). 

The next theorem shows that Theorem 8 (A) expresses a characteristic 
property of a non-fundamental W.


THEOREM 9. If to W there corresponds a unique subvariety W' of V', then W
is not fundamental for T (hence is either regular or irregular).


Proof. Since V is locally normal, the quotient ring Q(W) is integrally 
closed. By the fundamental theorem on principal orders(!*), Q(W) is the intersection
of the valuation rings which contain Q(W). Let $R_{v_1}$ be one of these
valuation rings. Since $R_{v_1} \supseteq Q(W)$, it follows that the center of $v_1$ on V is
either W or a subvariety $W_1$ of V which properly contains W (Theorem 3,
I.3). In the second case there exists a valuation v which is compounded with $v_1$
and has center W (Lemma 4, I.4). Since $R_v \subset R_{v_1}$, we can omit $R_{v_1}$ from the
set of valuation rings which contain Q(W), without affecting the intersection
of these rings. Hence Q(W) is also the intersection of the valuation rings which
belong to valuations of center W. Since, by hypothesis, all valuations of center
W on V have the same center W' on V', it follows that the corresponding
valuation rings all contain Q(W'). Consequently $Q(W) \supseteq Q(W')$, as was asserted.




COROLLARY. The dimension of a fundamental variety W cannot exceed $r - 2$.


For if W has dimension $r - 1$, then Q(W) is itself a valuation ring, namely
the valuation ring of a divisor v. Therefore W is the center of only one valua- 
tion, namely of the divisor v. 


THEOREM 10. If W is fundamental, then to W there correspond on V’ in- 
finitely many varieties W’. 


Proof. We shall prove that if to W there corresponds on V’ only a finite 
number of varieties, then W is not fundamental. 

Let $W'_1, W'_2, \dots, W'_h$ be the irreducible subvarieties of V' which correspond
to W. If v is any valuation of center W, then $R_v$ must contain at least
one of the h quotient rings $Q(W'_i)$. Hence $R_v$ contains the intersection of these
quotient rings. Since Q(W) is the intersection of all $R_v$, it follows that Q(W)
contains the intersection of the h quotient rings $Q(W'_i)$.

We can find a form $\varphi(\eta'_0, \eta'_1, \dots, \eta'_m)$, of a sufficiently high degree $\nu$,
such that $\varphi \neq 0$ on $W'_i$, $i = 1, 2, \dots, h$. We pass from V' to the variety $V'_1$
whose general point is defined by a linear K-basis of the forms of degree $\nu$
in $\eta'_0, \eta'_1, \dots, \eta'_m$. By Lemma 5, V' and $V'_1$ are in regular birational correspondence,
hence we may replace in our proof V' by $V'_1$. We may therefore
assume that $\varphi$ is one of the elements $\eta'_i$, say $\varphi = \eta'_0$. From the fact that
$\eta'_0 \neq 0$ on $W'_i$, $i = 1, 2, \dots, h$, it follows that the ring $\mathfrak{o}'$ of the nonhomogeneous
coordinates $\xi'_i = \eta'_i/\eta'_0$ is contained in each quotient ring $Q(W'_i)$.
Since the intersection of the rings $Q(W'_i)$ is contained in Q(W), our theorem
follows from Theorem 8 (E).

We shall now discuss briefly the general case in which V and V' are not
locally normal. Let $\bar V$ and $\bar V'$ be derived normal varieties of V and V' respectively.
Let W be an irreducible subvariety of V and let $\bar W_1, \bar W_2, \dots, \bar W_h$
be the irreducible varieties on $\bar V$ which correspond to W (Theorem 7 (B)).
We shall denote by $\bar T$ the birational correspondence between $\bar V$ and $\bar V'$.


DEFINITION 6. The variety W is regular for T, if each $\bar W_i$, $i = 1, 2, \dots, h$,
is regular for $\bar T$; W is fundamental for T, if at least one of the varieties $\bar W_i$ is
fundamental for $\bar T$; W is irregular for T, if it is neither regular nor fundamental,
that is, if no $\bar W_i$ is fundamental for $\bar T$ and if at least one $\bar W_i$ is irregular for $\bar T$.


Of the theorems proved in this section for locally normal varieties, Theo- 
rems 9 and 10 continue to hold in the general case. The validity of Theorem 10 
is obvious. As to Theorem 9, the proof is as follows. If T is single-valued
at W, say T(W) = W', then any irreducible subvariety of $\bar V'$ which corresponds
to $\bar W_i$ under $\bar T$ ($i = 1, 2, \dots, h$) must be among the irreducible subvarieties
of $\bar V'$ which correspond to W' in the birational correspondence
between V' and $\bar V'$. Hence to each $\bar W_i$ there can correspond on $\bar V'$ only a
finite number of varieties. Therefore no $\bar W_i$ is fundamental (Theorem 10)
for $\bar T$, and therefore, by definition, W is not fundamental for T.




In particular, if $T(W) = W'$ and if $Q(W) \supseteq Q(W')$, then W is not fundamental.
This follows from Theorem 6, II.1.

On the other hand, other results established for locally normal varieties 
do not generalize to varieties which are not locally normal. For instance, 
parts (A) and (B) of Theorem 8 cease to be true in the general case. Also the 
defining property of a non-fundamental variety used in Definition 4 ceases to 
be a property of non-fundamental varieties in the general case, that is, if W 
is not fundamental that does not mean that there must exist a W’ such that 
$W' = T(W)$ and $Q(W) \supseteq Q(W')$. Also the condition stated in Theorem 8 (E)
is sufficient, but no longer necessary. 

Note that according to Definition 6 the birational correspondence between
a variety V and a derived normal variety of V is free from fundamental
elements on either variety. 

We shall agree to use Definition 5 of regular birational correspondences 
also in the case of varieties which are not locally normal. 

4. A question of terminology. At this stage it becomes necessary to point
out and to discuss the difference between our terminology and the terminol- 
ogy used heretofore in the literature. This difference concerns the meaning of 
the term “fundamental” and our use of the new term “irregular.” 

In the case of algebraic surfaces it is the sense of the old terminology that 
both points and curves can be fundamental: a point P is fundamental if it is 
transformed into a curve $\Gamma$, and any such curve $\Gamma$, which is then the transform
of a point, is “fundamental.” As far as the notion of a fundamental
point is concerned this is in agreement with our terminology, from Theorem 8 
(A) and Theorem 10. However, by Theorem 9, corollary, a curve on an alge- 
braic surface can never be fundamental in our sense. The “fundamental” 
curves in the sense of the old terminology are irregular curves in our sense. 

The reasons for our terminology—or better—the inadequacy of the old 
terminology(24) become apparent in the case of higher varieties. Let us consider,
for instance, a birational correspondence T between two 3-dimensional
varieties V and V’. Again, according to the old terminology we may have 
“fundamental” loci of all dimensions from 0 to 2. As far as fundamental 
points and “fundamental” surfaces are concerned, the situation is the same 
as in the case of algebraic surfaces: there is complete agreement on funda- 
mental points, while according to our terminology there are definitely no 
“fundamental” surfaces, but only irregular surfaces. It is, however, the use 
of the term “fundamental curve” that brings out some significant facts. 

I can find no clear-cut definition of a fundamental curve in the literature.
This much is certain: if a curve $\Gamma$ is such that $T(\Gamma)$ is a surface, or if $\Gamma$ corresponds
to each point of another curve, then in the old terminology (and
also in our terminology) $\Gamma$ is fundamental (respectively, of the “first” or of
the “second kind”). Suppose, however, that the transform of $\Gamma$ is a single
point. I am not certain whether or not such a curve is fundamental in the
sense of the old terminology. If it is, then the terminology is confusing, since
we are dealing here with a curve at which the birational transformation is
single-valued. If it is not, then the terminology is inconsistent, in view of the
use of the term “fundamental” surface, since in both cases we are dealing with
a W such that T(W) is unique and is of lower dimension than W.

(24) The best justification for our terminology is its own logical consistency. We call fundamental
a variety W if and only if the birational correspondence T is infinitely many-valued at W.
Otherwise W is either regular or irregular.

It is quite possible that in the old terminology no special name has ever
been assigned to a curve $\Gamma$ such that $T(\Gamma)$ is a single point. If that is the
case, then this is probably due to the fact that, as a rule, only nonsingular
models have been considered in the literature. If the three-dimensional varieties
V and V' are nonsingular, then a curve $\Gamma$ which is transformed into a
point necessarily lies on a surface which is transformed into a curve (see
II.10, Theorem 17, corollary). Thus, such a curve $\Gamma$ always lies on a “fundamental”
surface, and there seemed to be no compelling reason for giving these
curves a special name. However, in the case of singular models it may very
well happen that a curve $\Gamma$ whose dimension is lowered by the birational transformation
T and at which T is single-valued (these two properties imply that
W is irregular; see Theorem 8 (B) and Theorem 9) does not lie on any surface
having the same properties (compare with Theorem 17, II.10). Some
term for such a curve is necessary, and the term “fundamental” we reject for
reasons given above.

5. The join of two birationally equivalent varieties. Let $(\eta_0, \eta_1, \dots, \eta_n)$
and $(\eta'_0, \eta'_1, \dots, \eta'_m)$ be the general points, respectively, of V and of V',
where V and V' are our two birationally equivalent varieties. Since the quotients
$\eta'_j/\eta'_0$ are rational functions of the quotients $\eta_i/\eta_0$, the $\eta'$'s are proportional
to forms of like degree in the $\eta$'s:

(4) $\eta'_0 : \eta'_1 : \dots : \eta'_m = \varphi_0(\eta) : \varphi_1(\eta) : \dots : \varphi_m(\eta).$


DEFINITION 7. The irreducible algebraic variety V* whose general point has
the (n+1)(m+1) products $\eta_i\varphi_j$ as homogeneous coordinates is called the join of
V and V'.


We shall denote the products $\eta_i\varphi_j$ by $\eta_{ij}$ and the quotients $\eta_i/\eta_0$, $\eta'_j/\eta'_0$
and $\eta_{ij}/\eta_{00}$ by $\xi_i$, $\xi'_j$ and $\xi_{ij}$, respectively. We have then:

(5) $\xi_{i0} = \xi_i, \quad \xi_{0j} = \xi'_j, \quad \xi_{ij} = \xi_i\xi'_j, \quad i, j \neq 0,$

and from these relations it follows that V* is birationally equivalent to V
(and to V'). Moreover, if we take as nonhomogeneous coordinates of the general
point of V* the quotients of the $\eta_{ij}$'s by a fixed $\eta_{ij}$, say by $\eta_{00}$, then the
ring of these coordinates is, by (5), the join of the two rings of nonhomogeneous
coordinates relative to V and V'. In symbols: if

$\mathfrak{o} = K[\xi_1, \xi_2, \dots, \xi_n], \quad \mathfrak{o}' = K[\xi'_1, \xi'_2, \dots, \xi'_m], \quad \mathfrak{o}^* = K[\xi_{10}, \xi_{20}, \dots, \xi_{nm}],$

then $\mathfrak{o}^* = (\mathfrak{o}, \mathfrak{o}')$.
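
The relations (5) follow by direct substitution. The short derivation below is a sketch that assumes the products defining the join are $\eta_{ij} = \eta_i\varphi_j(\eta)$ and that the proportionality (4) is read as $\eta'_j : \eta'_0 = \varphi_j(\eta) : \varphi_0(\eta)$.

```latex
% Assumes \eta_{ij} = \eta_i \varphi_j(\eta) and \varphi_j/\varphi_0 = \eta'_j/\eta'_0.
\begin{align*}
  \xi_{i0} &= \frac{\eta_{i0}}{\eta_{00}} = \frac{\eta_i\varphi_0}{\eta_0\varphi_0}
            = \frac{\eta_i}{\eta_0} = \xi_i, \\
  \xi_{0j} &= \frac{\eta_{0j}}{\eta_{00}} = \frac{\eta_0\varphi_j}{\eta_0\varphi_0}
            = \frac{\varphi_j}{\varphi_0} = \frac{\eta'_j}{\eta'_0} = \xi'_j, \\
  \xi_{ij} &= \frac{\eta_i\varphi_j}{\eta_0\varphi_0}
            = \frac{\eta_i}{\eta_0}\cdot\frac{\varphi_j}{\varphi_0} = \xi_i\,\xi'_j .
\end{align*}
```

In particular every $\xi_{ij}$ is a product of a generator of $\mathfrak{o}$ and a generator of $\mathfrak{o}'$, and the $\xi_i$, $\xi'_j$ themselves occur among the $\xi_{ij}$, which is why the ring of the $\xi_{ij}$ is the join of the two rings.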


THEOREM 11. In the birational correspondence T* between V and V* there 
corresponds to any irreducible subvariety W* of V* a unique subvariety W of V. 
If a given W on V is not fundamental for T, then it is regular for T*. Similarly
for V’ and V*. 


Proof. We may assume that W* is at finite distance with respect to the
nonhomogeneous coordinates $\xi_{ij}$. If v is any valuation of center W*, then
$R_v \supset Q(W^*) \supset \mathfrak{o}^* \supset \mathfrak{o}$, whence the center W of v on V is at finite distance with
respect to the nonhomogeneous coordinates $\xi_i$. Similarly for V', W' and
the $\xi'_j$. But then, if $\mathfrak{p}^*$ is the prime $\mathfrak{o}^*$-ideal of W*, the prime ideals $\mathfrak{p}$ and $\mathfrak{p}'$
of W and W', in the rings $\mathfrak{o}$ and $\mathfrak{o}'$, respectively, are necessarily the contracted
ideals of $\mathfrak{p}^*$, that is, $\mathfrak{p} = \mathfrak{p}^* \cap \mathfrak{o}$, $\mathfrak{p}' = \mathfrak{p}^* \cap \mathfrak{o}'$, and consequently W and W' are
uniquely determined by W*. Notice that the quotient rings Q(W) and Q(W')
are subrings of Q(W*):

(6) $Q(W) \subset Q(W^*), \qquad Q(W') \subset Q(W^*).$


To prove the second part of the theorem, let T(W) = W' and let us assume
that W and W' are at finite distance with respect to the nonhomogeneous
coordinates $\xi_i$ and $\xi'_j$. Let v be any valuation of center W and W' on V and V',
respectively. The center W* of v on V* will be at finite distance with respect
to the coordinates $\xi_{ij}$, since $R_v \supset \mathfrak{o}$, $R_v \supset \mathfrak{o}'$ and therefore $R_v \supset \mathfrak{o}^*$.

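Let us first consider the case in which V is locally normal. Since W is
not fundamental for T, we have $Q(W) \supseteq Q(W') \supset \mathfrak{o}'$. Therefore $Q(W) \supset \mathfrak{o}^*$,
that is, $\mathfrak{o}_{\mathfrak{p}} \supset \mathfrak{o}^*$. Since $\mathfrak{p}^* \cap \mathfrak{o} = \mathfrak{p}$, it follows that $\mathfrak{o}_{\mathfrak{p}}$ contains the ring $\mathfrak{o}^*_{\mathfrak{p}^*}$, that
is, $Q(W) \supseteq Q(W^*)$. Hence, by (6), $Q(W) = Q(W^*)$, whence W is regular for T*.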

To prove the theorem in the general case we first observe that if $\bar V$ and
$\bar V'$ denote derived normal varieties of V and of V', respectively, then any
derived normal variety $\bar V^*$ of the join of V and V' is in regular birational
correspondence with any derived normal variety of the join of $\bar V$ and $\bar V'$. The
proof is straightforward and consists in the obvious remark that the integral
closure of the ring $(\mathfrak{o}, \mathfrak{o}')$ is the same as the integral closure of the ring $(\mathfrak{o}_1, \mathfrak{o}'_1)$,
where $\mathfrak{o}_1$ and $\mathfrak{o}'_1$ are the integral closures of $\mathfrak{o}$ and of $\mathfrak{o}'$, respectively. Now
let $\bar W$ be any of the irreducible subvarieties of $\bar V$ which correspond to W. To
prove that W is regular for T* we have only to prove (Definition 6, II.3)
that $\bar W$ is regular for the birational correspondence between $\bar V$ and $\bar V^*$. Since
W, by hypothesis, is not fundamental for T, it follows (Definition 6) that $\bar W$
is not fundamental for the birational correspondence between $\bar V$ and $\bar V'$. Since
$\bar V$ is locally normal, it follows by the case just considered that $\bar W$ is regular
for the birational correspondence between $\bar V$ and the join of $\bar V$ and $\bar V'$. Since
a derived normal variety of this join is, by the remark made above, in regular
birational correspondence with $\bar V^*$, it follows that $\bar W$ is regular for the birational
correspondence between $\bar V$ and $\bar V^*$, as was asserted.


COROLLARY. If P and P' are corresponding points of V and V', then there
is only a finite number of points P* on V* which correspond to both P and P'.
If V (or V') is locally normal, then the number of such points P* can be greater
than 1 only if P (or P') is a fundamental point of T (or of $T^{-1}$).


For if we identify the varieties W, W' and W* of the preceding proof
with the points P, P' and P*, respectively, we see that $\mathfrak{p}^*$ must be a zero-dimensional
prime divisor of the ideal $\mathfrak{o}^*\cdot(\mathfrak{p}, \mathfrak{p}')$. Since this ideal is pure zero-dimensional,
the number of possible ideals $\mathfrak{p}^*$ is finite. The second half of
the corollary follows directly from the second half of the preceding theorem.

If the ground field K is algebraically closed, the ideal $\mathfrak{o}^*\cdot(\mathfrak{p}, \mathfrak{p}')$ is itself
prime, whenever $\mathfrak{p}$ and $\mathfrak{p}'$ are both zero-dimensional. Hence if K is algebraically
closed, then not only does every point P* of V* determine uniquely a
pair of corresponding points P, P' of V and V', respectively, but, conversely,
every such pair determines uniquely a point P* on V*. For this reason the join
V* is often referred to in the literature as the variety of pairs of corresponding
points of V and V'.

If K is not algebraically closed then P* need not be uniquely determined
by P and P'. The following is an example(25). Let K be the field of real numbers
and let $\Sigma = K(x, y)$, where x and y are indeterminates. We take as V
and V' two planes given, in nonhomogeneous coordinates, by the general
points $(x, y_1)$ and $(x_1, y)$, respectively, where $y_1 = y(x^2+1)$ and $x_1 = x(y^2+1)$.
Here we have: $\mathfrak{o} = K[x, y_1]$, $\mathfrak{o}' = K[x_1, y]$, $\mathfrak{o}^* = (\mathfrak{o}, \mathfrak{o}') = K[x, y]$. Let
$\mathfrak{p} = (x^2+1, y_1)$, $\mathfrak{p}' = (x_1, y^2+1)$. These ideals are prime and zero-dimensional
in their respective rings and they represent corresponding points of the two
planes V and V'. However, the ideal $\mathfrak{o}^*\cdot(\mathfrak{p}, \mathfrak{p}')$ is now the intersection of the
following prime ideals: $\mathfrak{p}^*_1 = (x^2+1, y-x)$, $\mathfrak{p}^*_2 = (x^2+1, y+x)$.
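
The decomposition asserted in this example can be checked by a short computation. The sketch below only uses the data of the example (K the field of real numbers, $y_1 = y(x^2+1)$, $x_1 = x(y^2+1)$); the intermediate identities are supplied here and are not spelled out in the text.

```latex
\begin{align*}
  \mathfrak{o}^*\cdot(\mathfrak{p}, \mathfrak{p}')
    &= \bigl(x^2 + 1,\ y(x^2 + 1),\ x(y^2 + 1),\ y^2 + 1\bigr)
     = (x^2 + 1,\ y^2 + 1), \\
  y^2 + 1 &= (y - x)(y + x) + (x^2 + 1), \\
  (x^2 + 1,\ y^2 + 1) &= \bigl(x^2 + 1,\ (y - x)(y + x)\bigr)
     = (x^2 + 1,\ y - x) \cap (x^2 + 1,\ y + x).
\end{align*}
```

Modulo $x^2+1$ the ring K[x, y] becomes a polynomial ring in y over the complex numbers, in which $y - x$ and $y + x$ are distinct linear factors; hence the two ideals on the right are distinct primes with the complex field as residue field, and the pair P, P' has two corresponding points on V*.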

The join V* of two locally normal varieties V and V' need not be locally
normal. One may often find it convenient to pass from V* to a derived normal
variety $\bar V^*$ of V*. We may call $\bar V^*$ the normal join of V and V'.


(25) It should not be too difficult to find necessary and sufficient conditions in order that a
given pair of corresponding points P, P' determine uniquely a point P* of the join V*. The
following is a sufficient condition. If $\Delta = \mathfrak{o}/\mathfrak{p}$ and $\Delta' = \mathfrak{o}'/\mathfrak{p}'$ are the residue fields of the points P
and P' respectively, then a least field $\Delta^*/K$ containing $\Delta/K$ and $\Delta'/K$ should exist such that its
relative degree over K is the product of the relative degrees of $\Delta$ and of $\Delta'$ over K. (It is not difficult
to see that this product is the maximum value for the relative degree of $\Delta^*$ over K, and if
that maximum is reached, then there exists, to within relative isomorphisms, only one least
field which contains $\Delta$ and $\Delta'$.) However, the above condition is not necessary. In fact, no condition
can be both necessary and sufficient which does not take into account the quotient rings
Q(P) and Q(P') themselves, besides the residue fields $\Delta$ and $\Delta'$.




The usefulness of the join V* is due to the possibility of deriving properties 
of the birational correspondence between V and V’ by first passing from V 
to V* and then from V* to V’. In each of these two steps we are dealing with 
a birational correspondence between two varieties which has no fundamental 
elements on one of the varieties (on V*). Birational correspondences of this sort 
are easier to handle, and they in fact play an important role in the general 
theory and in applications. 

6. Further properties of fundamental varieties. 


THEOREM 12. Given an irreducible subvariety W of V there exists an algebraic 
subvariety of V’ which we shall denote by T[W] and which has the following 
properties: 

A. Each irreducible component of T[W] corresponds to W.

B. Each irreducible subvariety W’ of V' which corresponds to W lies on 
T[W]. 


The variety T[W] shall be referred to in the sequel as the transform of W. 

Proof. Suppose that the theorem is true for V and the join V* of V
and V'. Then we show that it is also true for V and V'. For let T* denote,
as before, the birational correspondence between V and V* and let
$T^*[W] = W^*_1 + W^*_2 + \dots + W^*_h$, where each $W^*_i$ is irreducible. To each $W^*_i$
there corresponds on V' a unique irreducible variety $W'_i$. Since, by hypothesis,
$T^*(W) = W^*_i$, it follows that each of the varieties $W'_1, W'_2, \dots, W'_h$
corresponds to W. On the other hand, let W' be any irreducible subvariety
of V' which corresponds to W, and let W* be an irreducible subvariety of V*
which corresponds to both W and W'. By hypothesis, $W^* \subset T^*[W]$, say
$W^* \subset W^*_j$. Passing to the corresponding subvarieties W' and $W'_j$ of V',
we conclude that $W' \subset W'_j$, that is, $W' \subset W'_1 + W'_2 + \dots + W'_h$. Hence
$T[W] = W'_1 + W'_2 + \dots + W'_h$, where, of course, some of the h varieties $W'_i$
may be embedded, so that the number of irreducible components of T[W]
may actually be less than h.

We now prove the theorem for V and for the join V*. We use nonhomogeneous
coordinates. As nonhomogeneous coordinates for V* we can use the
coordinates $\xi_{ij}$ of the preceding section, since the (n+1)(m+1) systems of
coordinates $\eta_{ij}/\eta_{\alpha\beta}$ ($\alpha$ and $\beta$ are fixed for each system) cover the entire projective
space in which V* is embedded. To prove the existence of the transform
T*[W], it will be sufficient therefore to exhibit that part L* of T*[W]
which is at finite distance with respect to the coordinates $\xi_{ij}$. Let $\mathfrak{o}$, $\mathfrak{o}'$ and $\mathfrak{o}^*$
have the same meaning as in the preceding section. If W is not at finite
distance with respect to the coordinates $\xi_i$, then no W* which corresponds
to W on V* can be at finite distance with respect to the coordinates $\xi_{ij}$ (since
$\mathfrak{o} \subset \mathfrak{o}^*$). Hence in this case L* is empty. If W is at finite distance, it is given
by a prime ideal $\mathfrak{p}$ in $\mathfrak{o}$. Let $\mathfrak{p}^*_1, \mathfrak{p}^*_2, \dots, \mathfrak{p}^*_h$ be those minimal prime ideals of
$\mathfrak{o}^*\cdot\mathfrak{p}$ which contract to $\mathfrak{p}$, and let $W^*_i$ be the irreducible subvariety of V*
which is defined by $\mathfrak{p}^*_i$. I assert that $L^* = W^*_1 + W^*_2 + \dots + W^*_h$. For in the
first place, each $W^*_i$ corresponds to W. In the second place, if $W^* = T^*(W)$
and if W* is at finite distance with respect to the coordinates $\xi_{ij}$, W* is given
by a prime $\mathfrak{o}^*$-ideal $\mathfrak{p}^*$, such that $\mathfrak{p}^* \cap \mathfrak{o} = \mathfrak{p}$. Since $\mathfrak{p} \subset \mathfrak{p}^*$, we have $\mathfrak{o}^*\mathfrak{p} \subset \mathfrak{p}^*$,
whence $\mathfrak{p}^*$ must divide some minimal prime ideal $\mathfrak{p}'^*$ of $\mathfrak{o}^*\mathfrak{p}$. Since $\mathfrak{p} \subset \mathfrak{p}'^* \subset \mathfrak{p}^*$,
it follows that $\mathfrak{p} \subset \mathfrak{p}'^* \cap \mathfrak{o} \subset \mathfrak{p}$, that is, $\mathfrak{p}'^* \cap \mathfrak{o} = \mathfrak{p}$. Therefore $\mathfrak{p}'^*$ is one of the
ideals $\mathfrak{p}^*_1, \mathfrak{p}^*_2, \dots, \mathfrak{p}^*_h$, and since $\mathfrak{p}^* \supset \mathfrak{p}'^*$, we conclude that
$W^* \subset W^*_1 + W^*_2 + \dots + W^*_h$. This completes the proof.
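
The construction of L* in this proof can be carried out by hand in a small case. The following sketch is not taken from the text: it assumes that V is the plane with affine coordinates x, y, that V' is described in one affine chart by x and t = y/x, and that W is the point P given by x = y = 0; only the part at finite distance in this chart is computed.

```latex
% Assumed example (not from the text): \mathfrak{o} = K[x, y],
% \mathfrak{o}' = K[x, t] with t = y/x, W = P the point x = y = 0.
\begin{align*}
  \mathfrak{o}^* &= (\mathfrak{o}, \mathfrak{o}') = K[x, y, t] = K[x, t], \qquad y = x t, \\
  \mathfrak{p} &= (x, y), \qquad
  \mathfrak{o}^*\mathfrak{p} = (x,\ x t) = (x), \\
  (x) \cap \mathfrak{o} &= \{\, f(x, y) : f(0, 0) = 0 \,\} = \mathfrak{p}, \\
  L^* &= \{x = 0\}, \qquad \dim L^* = 1 > 0 = \dim P .
\end{align*}
```

The extended ideal has a single minimal prime, which contracts to $\mathfrak{p}$, so under these assumptions the part of T*[P] at finite distance is the whole line x = 0, of higher dimension than P.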


COROLLARY 1. If the birational correspondence between V and V' has no
fundamental elements on V’, then to a fundamental variety W on V there corre- 
sponds on V’ at least one variety of higher dimension than W; in other words, 
T[W] is of higher dimension than W. 


For if there are no fundamental elements on V' and if W' = T(W) then
dimension W' $\geq$ dimension W (Theorem 8, (B) and (C), II.3). If every W'
which corresponds to W were of the same dimension as W, then T[W] would
be pure $\rho$-dimensional, where $\rho$ = dimension W. But then the irreducible components
of T[W] would be the only subvarieties of V' which correspond to W,
and since the number of these components is finite, the corollary follows, by
Theorem 10, II.3.

As a consequence, we have the following characterization of fundamental 
varieties : 


COROLLARY 2. If V and V' are arbitrary birationally equivalent varieties and
if T and T* denote, respectively, the birational correspondence between V and V' 
and the birational correspondence between V and V*, then a given irreducible 
subvariety W of V is fundamental for T if and only if T*[W] is of higher di- 
mension than W. 


For W is fundamental for T if and only if it is fundamental for T* (Theorem
11, II.5), and since T* has no fundamental elements on V*, Corollary 1 applies to T*.

In addition to the transform T[W] we shall also have occasion to consider
what we call the total transform of W and that we shall denote by
T{W}. By that we mean the locus of points of V' which correspond to points
of W. That T{W} is an algebraic variety is seen as follows. As in the case of
T[W], so also here it is sufficient to show that T*{W} is algebraic, where
T* is, as usual, the birational correspondence between V and the join V* of
V and V'. Now if we consider that part $L_1^*$ of T*{W} which is at finite
distance with respect to the nonhomogeneous coordinates $\xi_{ij}$ we see immediately
that $L_1^*$ is the algebraic subvariety of V* which is defined by the ideal $\mathfrak{o}^*\mathfrak{p}$.
For a point P* of V*, at finite distance with respect to the $\xi_{ij}$'s, corresponds
to a point P on W, if and only if the corresponding 0-dimensional prime
$\mathfrak{o}^*$-ideal $\mathfrak{p}^*$ satisfies the relation: $\mathfrak{p}^* \cap \mathfrak{o} \supseteq \mathfrak{p}$, that is, if and only if $\mathfrak{p}^* \supseteq \mathfrak{o}^*\mathfrak{p}$.

The irreducible components of T*{W}, at finite distance, correspond to
the minimal prime ideals of $\mathfrak{o}^*\mathfrak{p}$. We have seen that the irreducible components
of T*[W], at finite distance, correspond to those minimal prime ideals
of $\mathfrak{o}^*\mathfrak{p}$ which contract to $\mathfrak{p}$. Hence T*[W] lies on T*{W}, and from this
follows immediately that also T[W] lies on T{W}. This result can also be
deduced directly from property B, stated in II.1. In the same fashion one
sees immediately that T{W} also has the following property: If $W_1 \subseteq W$, and
if $W_1' = T(W_1)$, then $W_1' \subseteq T\{W\}$.

We point out explicitly that T[W] may very well be a proper subvariety
of T{W}. For instance, if T is a plane Cremona transformation and if W is
a curve containing a fundamental point P, then $T\{W\} = T[W] + \Gamma'$, where
$\Gamma'$ is the irregular ("fundamental" in the old terminology) curve which corresponds
to the point P. Here T[W] is either a curve (if W is regular) or a
fundamental point (if W is irregular)(26). Quite generally, we have the following
theorem:


THEOREM 13. Any irreducible component of the total transform T{W} which
is not a component of the transform T[W] must correspond to a proper subvariety
$W_0$ of W. Moreover, if V is locally normal, then $W_0$ must be a fundamental
variety.


Proof. In the proof of Theorem 12 we have seen that if
$T^*[W] = W_1^* + W_2^* + \cdots + W_h^*$ and if $W_i'$ is the subvariety of V' which corresponds
to $W_i^*$, then $T[W] = W_1' + W_2' + \cdots + W_h'$. In a similar fashion it is seen
immediately that if $T^*\{W\} = W_1^* + W_2^* + \cdots$, then
$T\{W\} = W_1' + W_2' + \cdots$. Now let $W_0'$ be an irreducible
component of T{W} which does not belong to T[W]. Then $W_0'$ must
correspond to an irreducible component $W_0^*$ of T*{W} which does not belong
to T*[W]. Now by definition of T*{W}, each point of $W_0^*$ must correspond
to some point of W. Since to a point of V* there corresponds a unique point
of V, it follows that the subvariety $W_0$ of V which corresponds to $W_0^*$ must
lie on W. It must be a proper subvariety of W, since $W_0^* \not\subseteq T^*[W]$. Now
both $W_0'$ and $W_0$ correspond to $W_0^*$, whence they correspond to each other.
It remains to prove that $W_0$ is fundamental for T. If $W_0$ were not fundamental,
then by a stronger reason W would not be fundamental, since
$Q(W) \supseteq Q(W_0)$ (see Theorem 8 (E)). But then we would have $T[W_0] \subseteq T[W]$,
a contradiction.

If V is not locally normal, one passes to the derived normal varieties of V
and of V* and the rest of the proof is straightforward.

(26) This example shows therefore that the ideal $\mathfrak{o}^*\mathfrak{p}$ may very well possess minimal prime
ideals which contract in $\mathfrak{o}$ (not to $\mathfrak{p}$ but) to proper divisors of $\mathfrak{p}$. In this connection we wish to
correct a statement on p. 135 in Krull's Ergebnisse report Idealtheorie. The theorem stated on that
page consists of three parts, and in the first part it is asserted, among other things, that $\overline{\mathfrak{p}} = \mathfrak{p}\cdot\mathfrak{S}$.
This (and only this) assertion is incorrect. The rings $\mathfrak{J}$ and $\mathfrak{S}$ play there the role of our rings $\mathfrak{o}$
and $\mathfrak{o}^*$. It is quite true that there is only one prime ideal $\overline{\mathfrak{p}}$ which lies over $\mathfrak{p}$ (that is, such that
$\overline{\mathfrak{p}} \cap \mathfrak{J} = \mathfrak{p}$), and it is also true that $\overline{\mathfrak{p}}$ is a minimal prime of the ideal $\mathfrak{S}\cdot\mathfrak{p}$. The equality $\mathfrak{J}_{\mathfrak{p}} = \mathfrak{S}_{\overline{\mathfrak{p}}}$
shows clearly that the case under consideration corresponds to a regular W. Nevertheless, $\mathfrak{S}\cdot\mathfrak{p}$
may have minimal prime ideals other than $\overline{\mathfrak{p}}$, as was pointed out above. Already the quadratic
transformation x' = x, y' = y/x may serve very well as a source of simple counterexamples.
The formal source of that incorrect statement made in Krull's report is the erroneous assertion
made earlier on the same page (line 12) to the effect that if the rank equals m then Ly is a field.


COROLLARY. If no subvariety of W is fundamental, and if V is locally normal,
then the total transform of W coincides with the transform of W. 
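
As a check on Theorem 13 and its corollary, the quadratic transformation x' = x, y' = y/x cited in the footnote on Krull's report can be worked out by hand. The sketch below is an illustration only; the choice of W as the line y = 0 and the names $\mathfrak{q}_1$, $\mathfrak{q}_2$ are assumptions introduced here, not notation from the text.

```latex
% Illustrative sketch (assumed example): o = K[x,y], o* = K[x,y'] with y = x y'.
% W is the line y = 0, a curve through the fundamental point P = (0,0).
\begin{align*}
\mathfrak{o} &= K[x,y] \subset \mathfrak{o}^* = K[x,y'], \qquad y = x\,y', \qquad \mathfrak{p} = (y),\\
\mathfrak{o}^*\mathfrak{p} &= (x\,y') = \mathfrak{q}_1 \cap \mathfrak{q}_2, \qquad
   \mathfrak{q}_1 = (y'), \quad \mathfrak{q}_2 = (x),\\
\mathfrak{q}_1 \cap \mathfrak{o} &= \{\, f \in K[x,y] : f(x,0) = 0 \,\} = (y) = \mathfrak{p},\\
\mathfrak{q}_2 \cap \mathfrak{o} &= \{\, f \in K[x,y] : f(0,0) = 0 \,\} = (x,y) \supsetneq \mathfrak{p}.
\end{align*}
% Hence, at finite distance:
%   T*[W] = { y' = 0 },      T*{W} = { y' = 0 } + { x = 0 }.
```

The extra component x = 0 of the total transform corresponds to the point P = (0, 0), a proper subvariety of W which is fundamental, in agreement with Theorem 13.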


We make one more remark about the transforms T[W] and T{W}. It
is clear that in no case can these varieties be empty. However, it may very
well happen that with respect to a given system of nonhomogeneous coordinates
either T[W] or even T{W} is entirely at infinity. Referring to the
join V* and to the rings $\mathfrak{o}$ and $\mathfrak{o}^*$ considered above, we see that T*[W] is
at infinity if $\mathfrak{o}^*$ does not contain prime ideals which contract to $\mathfrak{p}$ (in the
terminology of Krull: $\mathfrak{p}$ is lost in $\mathfrak{o}^*$, see [2, p. 134]). If also T*{W} is entirely at
infinity, then the ideal $\mathfrak{o}^*\mathfrak{p}$ is the unit ideal.

7. The main theorem. If the birational correspondence between V and V’ 
has no fundamental elements on V', and if W is fundamental for T, then,
by Theorem 12, Corollary 1, the transform T[W] has at least one component 
of higher dimension than W. In the general case, that is, when V is an arbi- 
trary variety, that is the best result one may claim, since it is quite possible 
for T[W] to possess components which have the same dimension as W. How- 
ever, in the case in which V is locally normal at W we have the following im- 
portant theorem: 


MAIN THEOREM. If W is an irreducible fundamental variety on V of a birational
correspondence T between V and V' and if T has no fundamental elements
on V', then, under the assumption that V is locally normal at W, each
irreducible component of the transform T[W] is of higher dimension than W.


COROLLARY. In the more general case in which T has fundamental elements
on both V and V' and under the assumption that W is fundamental for T and
that V is locally normal at W, the transform T*[W] of W on the join V* of V
and V' has the property that all its irreducible components are of higher dimension
than W.


We shall first give the main theorem another formulation which is more 
directly algebraic. Since V is locally normal at W, it is permissible to replace 
V by a derived normal variety of V (Theorem 7, Corollary 2, II.2). Hence 
we assume that V is a normal variety. Since there are no fundamental ele- 
ments on V’, the birational correspondence between V’ and V* is regular 
(Theorem 11, II.5). Hence it is permissible to replace V’ by V*. When these 
preparations are carried out then the main theorem expresses a feature of
the relationship between the ideals in the two rings $\mathfrak{o}$ and $\mathfrak{o}^*$ considered in
II.5, that is, we have to prove the following theorem:




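THEOREM 14. Let $\mathfrak{o}$ and $\mathfrak{o}^*$ be two finite integral domains with the same quotient
field $\Sigma$, where we assume that the ring $\mathfrak{o}$ is integrally closed and that it is a
subring(27) of $\mathfrak{o}^*$. Let $\mathfrak{p}$ be a prime ideal in $\mathfrak{o}$ and let $\mathfrak{p}^*$ be a prime ideal in $\mathfrak{o}^*$ which
lies over $\mathfrak{p}$, that is, such that $\mathfrak{p}^* \cap \mathfrak{o} = \mathfrak{p}$. If $\mathfrak{p}$ and $\mathfrak{p}^*$ have the same dimension, then

(1) either the quotient rings $\mathfrak{o}_{\mathfrak{p}}$ and $\mathfrak{o}^*_{\mathfrak{p}^*}$ coincide(28) (and in this case $\mathfrak{p}^*$ is
obviously the only prime $\mathfrak{o}^*$-ideal which lies over $\mathfrak{p}$) or

(2) $\mathfrak{p}^*$ is not minimal with respect to the property of lying over $\mathfrak{p}$, that is,
there exists in $\mathfrak{o}^*$ another prime ideal which also lies over $\mathfrak{p}$ and which is a proper
multiple of $\mathfrak{p}^*$.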


The proof of this theorem is rather long and will be developed in this and
in the following section. That part of the proof which is contained in this section
consists of three steps: (a) a reduction to a simpler special case; (b) a
reference to the theorem of Krull which we have already mentioned(26);
(c) a lemma.

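(a) Let $\mathfrak{o}'^*$ be the integral closure of $\mathfrak{o}^*$ in $\Sigma$. It is well known that over
every prime ideal $\mathfrak{p}^*$ in $\mathfrak{o}^*$ there lies at least one prime ideal $\mathfrak{p}'^*$ in $\mathfrak{o}'^*$, and
that the number of such ideals $\mathfrak{p}'^*$ is finite. Moreover $\mathfrak{p}^*$ and $\mathfrak{p}'^*$ have the
same dimension and it is clear that if $\mathfrak{p}^* \cap \mathfrak{o} = \mathfrak{p}$, then $\mathfrak{o}_{\mathfrak{p}} \subseteq \mathfrak{o}^*_{\mathfrak{p}^*} \subseteq \mathfrak{o}'^*_{\mathfrak{p}'^*}$. From this
it follows that if our theorem is true for the pair of rings $\mathfrak{o}$ and $\mathfrak{o}'^*$, then it is
also true for $\mathfrak{o}$ and $\mathfrak{o}^*$. Hence we may assume that $\mathfrak{o}^*$ is integrally closed.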

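Since $\mathfrak{o}$ and $\mathfrak{o}^*$ are finite integral domains, $\mathfrak{o}^*$ is a finite ring extension of $\mathfrak{o}$.
We prefer to think of $\mathfrak{o}^*$ as a ring obtained from $\mathfrak{o}$ by a finite number of simple
ring extensions, each ring extension being followed up by the operation of integral
closure. Let therefore:

$\mathfrak{o}_1 = \mathfrak{o}[\alpha_1]$, $\mathfrak{o}_1^*$ = integral closure of $\mathfrak{o}_1$;

$\mathfrak{o}_2 = \mathfrak{o}_1^*[\alpha_2]$, $\mathfrak{o}_2^*$ = integral closure of $\mathfrak{o}_2$;

. . . . . . . . . .

$\mathfrak{o}_m = \mathfrak{o}_{m-1}^*[\alpha_m]$, $\mathfrak{o}_m^* = \mathfrak{o}^*$ = integral closure of $\mathfrak{o}_m$.

Let us assume that our theorem is true when m = 1. Then we show, by induction
with respect to m, that the theorem is true for any value of m. Let
$\mathfrak{p}_{m-1}^* = \mathfrak{p}^* \cap \mathfrak{o}_{m-1}^*$. Since $\mathfrak{o} \subseteq \mathfrak{o}_{m-1}^* \subseteq \mathfrak{o}^*$, we have: dimension $\mathfrak{p}^* \ge$ dimension
$\mathfrak{p}_{m-1}^* \ge$ dimension $\mathfrak{p}$. If $\mathfrak{p}$ and $\mathfrak{p}^*$ have the same dimension, then it follows
that also $\mathfrak{p}_{m-1}^*$ has the same dimension as $\mathfrak{p}$. Now let us also assume that no
proper prime multiple of $\mathfrak{p}^*$ lies over $\mathfrak{p}$. Then no proper prime multiple of $\mathfrak{p}^*$
can lie over $\mathfrak{p}_{m-1}^*$, and therefore, by the case m = 1, we have that $\mathfrak{o}^*_{\mathfrak{p}^*}$ coincides
with the quotient ring of $\mathfrak{p}_{m-1}^*$ in $\mathfrak{o}_{m-1}^*$. Consequently, there is a (1, 1) correspondence
between the prime multiples of $\mathfrak{p}^*$ in $\mathfrak{o}^*$ and the prime multiples
of $\mathfrak{p}_{m-1}^*$ in $\mathfrak{o}_{m-1}^*$. (See I.2.) Therefore it is equally true that no proper prime multiple
of $\mathfrak{p}_{m-1}^*$ lies over $\mathfrak{p}$. By our induction, we conclude that also $\mathfrak{o}_{\mathfrak{p}}$ coincides with
the quotient ring of $\mathfrak{p}_{m-1}^*$ in $\mathfrak{o}_{m-1}^*$. Hence $\mathfrak{o}_{\mathfrak{p}} = \mathfrak{o}^*_{\mathfrak{p}^*}$, as was asserted.

We therefore have only to prove our theorem in the following special case:
$\mathfrak{o}^*$ is the integral closure of a ring $\mathfrak{o}'$, where $\mathfrak{o}'$ is a simple ring extension of $\mathfrak{o}$:
$\mathfrak{o}' = \mathfrak{o}[\alpha]$.

(27) It will be seen from the proof that these assumptions can be weakened as follows:
$\mathfrak{o}$ has a finite degree of transcendency over the ground field and is a finite discrete principal order
(see Krull [2, p. 104]); $\mathfrak{o}^*$ is a finite ring extension of $\mathfrak{o}$.

(28) This case arises when W is not fundamental, therefore regular.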

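(b) In this special case we shall make use of a theorem stated in Krull
[2, p. 135], to which we have already referred in the preceding section(26).
We write the principal fractional ideal $\mathfrak{o}\cdot\alpha$ as a quotient of two integral ideals:
$\mathfrak{o}\cdot\alpha = \mathfrak{z}/\mathfrak{n}$, where $\mathfrak{z}$ and $\mathfrak{n}$ are symbolic power products of minimal prime
ideals in $\mathfrak{o}$, without common factors. Krull distinguishes three cases:
(1) $\mathfrak{n} \equiv 0(\mathfrak{p})$, $\mathfrak{z} \not\equiv 0(\mathfrak{p})$; (2) $\mathfrak{n} \not\equiv 0(\mathfrak{p})$; (3) $\mathfrak{n} \equiv 0(\mathfrak{p})$, $\mathfrak{z} \equiv 0(\mathfrak{p})$.

In the first case the element $1/\alpha$ is a non-unit in $\mathfrak{o}_{\mathfrak{p}}$, and from this it follows
immediately that $\mathfrak{p}$ is lost in $\mathfrak{o}'$, that is, no prime ideal in $\mathfrak{o}'$ contracts to $\mathfrak{p}$.
But then $\mathfrak{p}$ is also lost in $\mathfrak{o}^*$, since $\mathfrak{o}^*$ is the integral closure of $\mathfrak{o}'$, and in this
case there is nothing to prove.

In the second case $\alpha$ is contained in the quotient ring $\mathfrak{o}_{\mathfrak{p}}$, whence $\mathfrak{o}^* \subseteq \mathfrak{o}_{\mathfrak{p}}$,
since $\mathfrak{o}_{\mathfrak{p}}$ is integrally closed. From this we conclude that the two rings $\mathfrak{o}_{\mathfrak{p}}$,
$\mathfrak{o}^*_{\mathfrak{p}^*}$ coincide(29). This is the alternative (1) of the theorem.

The really significant case is the third one. In this case Krull's result is to
the effect that $\mathfrak{o}'\cdot\mathfrak{p}$ is a prime ideal $\mathfrak{p}'$, that $\mathfrak{p}'$ lies over $\mathfrak{p}$ and that the dimension
of $\mathfrak{p}'$ is one greater than the dimension of $\mathfrak{p}$. We shall make use of this result.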
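
The three cases can be made concrete on the simplest example. What follows is only a sketch: the choice $\mathfrak{o} = K[x, y]$, $\alpha = y/x$ (the quadratic transformation x' = x, y' = y/x) and the particular prime ideals are assumptions made here for illustration. In this example $\mathfrak{o}' = K[x, y']$ is already integrally closed, so that $\mathfrak{o}^* = \mathfrak{o}'$.

```latex
% Illustrative sketch (assumed example): o = K[x,y], alpha = y/x = y',
% o' = o[alpha] = K[x,y'] = o*, and  o.alpha = z/n  with  z = (y), n = (x).
\begin{align*}
\text{Case (1):}\quad & \mathfrak{p} = (x):\ \mathfrak{n} \equiv 0(\mathfrak{p}),\ \mathfrak{z} \not\equiv 0(\mathfrak{p});\quad
   1/\alpha = x/y \text{ is a non-unit of } \mathfrak{o}_{\mathfrak{p}}.\\
 & \text{Any prime of } \mathfrak{o}' \text{ containing } x \text{ contains } y = x\,y',
   \text{ so it contracts to } (x,y) \ne \mathfrak{p}:\ \mathfrak{p} \text{ is lost in } \mathfrak{o}'.\\[4pt]
\text{Case (2):}\quad & \mathfrak{p} = (y):\ \mathfrak{n} \not\equiv 0(\mathfrak{p});\quad
   \alpha = y/x \in \mathfrak{o}_{\mathfrak{p}},\quad \mathfrak{p}^* = (y'),\quad
   \mathfrak{o}_{\mathfrak{p}} = \mathfrak{o}^*_{\mathfrak{p}^*}.\\[4pt]
\text{Case (3):}\quad & \mathfrak{p} = (x,y):\ \mathfrak{n} \equiv 0(\mathfrak{p}),\ \mathfrak{z} \equiv 0(\mathfrak{p});\quad
   \mathfrak{o}'\mathfrak{p} = (x,\, x\,y') = (x) = \mathfrak{p}',\\
 & \mathfrak{p}' \cap \mathfrak{o} = (x,y) = \mathfrak{p},\qquad
   \dim \mathfrak{p}' = 1 = \dim \mathfrak{p} + 1 .
\end{align*}
```

In case (3) every zero-dimensional prime $(x, y' - c)$ of $\mathfrak{o}^*$ lies over $\mathfrak{p}$ and properly contains $\mathfrak{p}' = (x)$, which also lies over $\mathfrak{p}$; this is alternative (2) of Theorem 14 in this example.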

(c) In addition to the above result which concerns the relationship between
the ideal theory in $\mathfrak{o}$ and in $\mathfrak{o}'$, we shall have to make use, in a very
essential fashion, of the following property of the conductor $\mathfrak{C}$ of $\mathfrak{o}'$ with respect
to $\mathfrak{o}^*$:


LEMMA 6. Each prime $\mathfrak{o}^*$-ideal $\mathfrak{p}^*$ of the conductor $\mathfrak{C}$ has the property that
it contracts in $\mathfrak{o}$ to a prime ideal of lower dimension.


Proof. The lemma implies in particular that $\mathfrak{C}$ has no zero-dimensional
prime ideals. Let us assume that this particular consequence of the lemma has
been established and let us show that then the lemma follows by the usual
device of ground field extension.

Let $\mathfrak{p}^*$ be a prime $\mathfrak{o}^*$-ideal and let $\mathfrak{p}^* \cap \mathfrak{o} = \mathfrak{p}$. We assume that $\mathfrak{p}$ and $\mathfrak{p}^*$
have the same dimension, say dimension s. It shall now be shown that $\mathfrak{p}^*$ cannot
be a prime ideal of $\mathfrak{C}$.

We select in $\mathfrak{o}$ a set of s elements $\zeta_1, \zeta_2, \dots, \zeta_s$ which are algebraically
independent modulo $\mathfrak{p}$. We adjoin these elements to the ground field K getting
a new ground field $K_1 = K(\zeta_1, \zeta_2, \dots, \zeta_s)$, and also the new rings: $\mathfrak{o}_1 = K_1\cdot\mathfrak{o}$,
$\mathfrak{o}_1' = K_1\cdot\mathfrak{o}'$, $\mathfrak{o}_1^* = K_1\cdot\mathfrak{o}^*$. We point out that $\mathfrak{o}_1^*$ is a quotient ring of $\mathfrak{o}^*$, namely
$\mathfrak{o}_1^* = \mathfrak{o}^*_S$, where S is the set of all polynomials in $\zeta_1, \zeta_2, \dots, \zeta_s$ with coefficients
in K. Similarly we have: $\mathfrak{o}_1 = \mathfrak{o}_S$. We therefore can apply the properties
of the correspondence between the ideals in a given ring R and a quotient
ring $R_S$, as described in I.1. We find then that $\mathfrak{p}_1 = \mathfrak{o}_1\cdot\mathfrak{p}$ and $\mathfrak{p}_1^* = \mathfrak{o}_1^*\cdot\mathfrak{p}^*$ are
prime ideals and $\mathfrak{C}_1 = K_1\mathfrak{C}$ is the conductor of $\mathfrak{o}_1'$ with respect to $\mathfrak{o}_1^*$. We have
$\mathfrak{o}_1' = \mathfrak{o}_1[\alpha]$ and it is clear that $\mathfrak{o}_1^*$ is the integral closure of $\mathfrak{o}_1'$. Since $\mathfrak{p}_1^*$ is zero-dimensional
over the ground field $K_1$, it follows, by our assumption, that
$\mathfrak{C}_1 : \mathfrak{p}_1^* = \mathfrak{C}_1$. On the other hand we have $\mathfrak{C}_1 : \mathfrak{p}_1^* = K_1\cdot(\mathfrak{C} : \mathfrak{p}^*)$. Hence the two
ideals $\mathfrak{C}$ and $\mathfrak{C} : \mathfrak{p}^*$ have the same extended ideal in $\mathfrak{o}_1^*$. It follows (see II.1(*))
that these two ideals can only differ by primary components whose associate
prime ideals contain polynomials in $\zeta_1, \zeta_2, \dots, \zeta_s$. Since the $\zeta$'s are algebraically
independent modulo $\mathfrak{p}^*$, we conclude that $\mathfrak{p}^*$ is not among the prime
ideals of $\mathfrak{C}$, as was asserted.

(29) The proof is exactly the same as the proof which in II.2 led us to the conclusion that
the quotient ring of $\mathfrak{m}_i'$ in the ring $\mathfrak{J}^*$ coincides with $\mathfrak{J}_i'$.

The proof of the lemma is thus reduced to the matter of proving that the
conductor $\mathfrak{C}$ does not possess zero-dimensional prime ideals.

Let $\mathfrak{p}^*$ be a zero-dimensional prime ideal in $\mathfrak{o}^*$. We have to prove that
$\mathfrak{C} : \mathfrak{p}^* = \mathfrak{C}$. Let $\zeta^*$ be an arbitrary element of $\mathfrak{C} : \mathfrak{p}^*$. We denote by f(x) the
irreducible polynomial in K[x] such that $f(\alpha) \equiv 0(\mathfrak{p}^*)$. We have then:
$\zeta^* f(\alpha) \equiv 0(\mathfrak{C})$, whence $\zeta^* f(\alpha)\eta^* \in \mathfrak{o}'$, for any element $\eta^*$ in $\mathfrak{o}^*$. Hence we may
write

(7) $\zeta^*\eta^* f(\alpha) = G(\alpha) = \omega_0\alpha^{\nu} + \omega_1\alpha^{\nu-1} + \cdots + \omega_{\nu}$, $\omega_i \in \mathfrak{o}$.

We divide through G(x) by f(x):

$G(x) = A(x)f(x) + R(x)$,

where all polynomials are in $\mathfrak{o}[x]$ and where R(x) is of degree at most m-1,
if m is the degree of f(x). We now rewrite (7) as follows (notice that $f(\alpha) \ne 0$,
since $\alpha$ is not in $\mathfrak{o}$ and since $\mathfrak{o}$ is integrally closed):

(8) $\zeta^*\eta^* = A(\alpha) + R(\alpha)/f(\alpha)$.

Let v be an arbitrary valuation of $\Sigma$ whose valuation ring $R_v$ contains $\mathfrak{o}$.
If $v(\alpha) \ge 0$, then $\mathfrak{o}[\alpha] \subseteq R_v$, and also $\mathfrak{o}^* \subseteq R_v$. Hence, by (8), we have
$R(\alpha)/f(\alpha) \in R_v$. If $v(\alpha) < 0$, then $v(R(\alpha)) > v(f(\alpha))$, since R is of less degree
than f and since $v(f(\alpha)) = m\,v(\alpha)$ (the leading coefficient of f(x) is an element
of K). Hence also in this case $R(\alpha)/f(\alpha)$ is contained in $R_v$. Since this holds
true for any valuation v such that $\mathfrak{o} \subseteq R_v$ and since $\mathfrak{o}$ is integrally closed, we
conclude that $R(\alpha)/f(\alpha) \in \mathfrak{o}$. Hence, by (8), $\zeta^*\eta^* \in \mathfrak{o}'$, for any element $\eta^*$ in $\mathfrak{o}^*$.
Consequently $\zeta^* \in \mathfrak{C}$. Since $\zeta^*$ was an arbitrary element of $\mathfrak{C} : \mathfrak{p}^*$, it follows
that $\mathfrak{C} : \mathfrak{p}^* = \mathfrak{C}$. This completes the proof of the lemma.

8. Continuation of the proof of the main theorem. To prove the main
theorem, or better, the equivalent Theorem 14, we shall proceed as follows.
We assume that we have the special case described in the preceding section
under (a). Let $\mathfrak{p}^*$ be a prime ideal in $\mathfrak{o}^*$ and let $\mathfrak{p}^* \cap \mathfrak{o} = \mathfrak{p}$. We shall also assume
that we are dealing with the significant case $\mathfrak{n} \equiv 0(\mathfrak{p})$, $\mathfrak{z} \equiv 0(\mathfrak{p})$, in which case
we have the result of Krull as stated in the preceding section under (b). We
shall prove that if $\mathfrak{p}$ and $\mathfrak{p}^*$ have the same dimension, then $\mathfrak{p}^*$ contains properly
another prime ideal $\mathfrak{p}_1^*$ with the property: $\mathfrak{p}_1^* \cap \mathfrak{o} = \mathfrak{p}$. This is the second
alternative of Theorem 14.

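We divide the proof into two parts, according as $\mathfrak{C} \not\equiv 0(\mathfrak{p}^*)$ or $\mathfrak{C} \equiv 0(\mathfrak{p}^*)$.

First case: $\mathfrak{C} \not\equiv 0(\mathfrak{p}^*)$. Let $\mathfrak{p}^* \cap \mathfrak{o}' = \mathfrak{p}_1'$. Since $\mathfrak{C} \not\equiv 0(\mathfrak{p}_1')$, it follows in an
elementary fashion from the very definition of the conductor, that $\mathfrak{p}^*$ is the
only prime ideal in $\mathfrak{o}^*$ which contracts to $\mathfrak{p}_1'$ and that the quotient rings
$\mathfrak{o}^*_{\mathfrak{p}^*}$, $\mathfrak{o}'_{\mathfrak{p}_1'}$ coincide. We have $\mathfrak{p}_1' \cap \mathfrak{o} = \mathfrak{p}$, whence $\mathfrak{p}_1'$ must be a divisor of the prime
ideal $\mathfrak{p}' = \mathfrak{o}'\cdot\mathfrak{p}$. It must be a proper divisor of $\mathfrak{p}'$, since $\mathfrak{p}_1'$ is of the same dimension
as $\mathfrak{p}$, while by Krull's result $\mathfrak{p}'$ is of dimension one greater than $\mathfrak{p}$. Now
since $\mathfrak{C} \not\equiv 0(\mathfrak{p}_1')$, we have a fortiori $\mathfrak{C} \not\equiv 0(\mathfrak{p}')$. Hence there is a unique prime
ideal $\mathfrak{p}'^*$ in $\mathfrak{o}^*$ which contracts to $\mathfrak{p}'$ and we have $\mathfrak{o}^*_{\mathfrak{p}'^*} = \mathfrak{o}'_{\mathfrak{p}'}$. Since we have also
$\mathfrak{o}^*_{\mathfrak{p}^*} = \mathfrak{o}'_{\mathfrak{p}_1'}$ and since $\mathfrak{p}_1'$ is a proper divisor of $\mathfrak{p}'$, it follows that also $\mathfrak{p}^*$ is a
proper divisor of $\mathfrak{p}'^*$. Since $\mathfrak{p}'^* \cap \mathfrak{o} = \mathfrak{p}' \cap \mathfrak{o} = \mathfrak{p}$, our proof is complete.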

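Second case: $\mathfrak{C} \equiv 0(\mathfrak{p}^*)$. In this case $\mathfrak{p}^*$ is either a prime ideal of the conductor
$\mathfrak{C}$ or properly contains a prime ideal of $\mathfrak{C}$. Since, by hypothesis, $\mathfrak{p}^*$ contracts
in $\mathfrak{o}$ to an ideal $\mathfrak{p}$ of the same dimension as $\mathfrak{p}^*$, the first possibility is
excluded by our lemma. Hence $\mathfrak{p}^*$ properly contains a prime ideal $\mathfrak{p}_1^*$ of $\mathfrak{C}$.
Let $\mathfrak{p}_1^* \cap \mathfrak{o} = \mathfrak{p}_1$ and let

dimension $\mathfrak{p}$ = s, dimension $\mathfrak{p}_1$ = $s_1$.


Since $\mathfrak{p}^* \supset \mathfrak{p}_1^*$, we have $\mathfrak{p} \supseteq \mathfrak{p}_1$. If $\mathfrak{p}_1 = \mathfrak{p}$ then the theorem is proved, since we
have now a proper multiple $\mathfrak{p}_1^*$ of $\mathfrak{p}^*$ which also contracts to $\mathfrak{p}$. We assume
therefore that $\mathfrak{p} \supset \mathfrak{p}_1$, whence $s_1 > s$. By our lemma, $\mathfrak{p}_1^*$ is of greater dimension
than $\mathfrak{p}_1$; by Krull's theorem, the dimension of $\mathfrak{p}_1^*$ is at most one greater than
the dimension of $\mathfrak{p}_1$. Hence the dimension of $\mathfrak{p}_1^*$ is $s_1 + 1$. Our proof would be
complete if we could show that there exists a prime ideal in $\mathfrak{o}^*$, between $\mathfrak{p}^*$
and $\mathfrak{p}_1^*$ (and different from $\mathfrak{p}^*$) which contracts to $\mathfrak{p}$. This we proceed to show.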

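We pass to the residue class rings $\mathfrak{O} = \mathfrak{o}/\mathfrak{p}_1$, $\mathfrak{O}^* = \mathfrak{o}^*/\mathfrak{p}_1^*$. Both rings are
finite integral domains and $\mathfrak{O}$ is a subring of $\mathfrak{O}^*$. The first ring is of degree
of transcendency $s_1$, while $\mathfrak{O}^*$ is of degree of transcendency $s_1 + 1$. In the
homomorphisms $\mathfrak{o} \sim \mathfrak{O}$, $\mathfrak{o}^* \sim \mathfrak{O}^*$, the prime ideals $\mathfrak{p}$ and $\mathfrak{p}^*$ are mapped,
respectively, onto prime ideals $\mathfrak{P}$ and $\mathfrak{P}^*$, of the same dimension s, $s < s_1$, and
we have: $\mathfrak{P}^* \cap \mathfrak{O} = \mathfrak{P}$. What we have to prove is the existence in $\mathfrak{O}^*$ of a
prime ideal which is a proper multiple of $\mathfrak{P}^*$ and which contracts in $\mathfrak{O}$ to
the ideal $\mathfrak{P}$. The assertion that such an ideal exists is equivalent to the assertion
that $\mathfrak{P}^*$ is not a minimal prime of the extended ideal $\mathfrak{O}^*\cdot\mathfrak{P}$. To prove this
we pass to the quotient rings $\mathfrak{J} = \mathfrak{O}_{\mathfrak{P}}$, $\mathfrak{J}^* = \mathfrak{O}^*_{\mathfrak{P}^*}$. If $\mathfrak{m}$ and $\mathfrak{m}^*$ denote the ideals
of non-units in these two rings, then our problem is to prove that $\mathfrak{m}^*$ is not a
minimal prime ideal of the extended ideal $\mathfrak{J}^*\cdot\mathfrak{m}$. The proof of this will complete
the proof of the main theorem.

Since $\mathfrak{J}$ is of degree of transcendency $s_1$, we can find $s_1 - s$ elements
$\zeta_1, \zeta_2, \dots, \zeta_{s_1-s}$ in $\mathfrak{J}$ such that the ideal $\mathfrak{A} = \mathfrak{J}\cdot(\zeta_1, \zeta_2, \dots, \zeta_{s_1-s})$ be exactly
s-dimensional. Since the ideal of non-units $\mathfrak{m}$ in $\mathfrak{J}$ is also s-dimensional, it
follows that $\mathfrak{A}$ will then be necessarily a primary ideal, with $\mathfrak{m}$ as associated
prime. Now consider the ideal $\mathfrak{A}^* = \mathfrak{J}^*\cdot(\zeta_1, \zeta_2, \dots, \zeta_{s_1-s})$. Since $\mathfrak{J}^*$ is of degree
of transcendency $s_1 + 1$ and since $\mathfrak{A}^*$ is not the unit ideal (since
$\zeta_i \in \mathfrak{A} \equiv 0(\mathfrak{m})$, whence $\mathfrak{A}^* \equiv 0(\mathfrak{m}^*)$), every minimal prime of $\mathfrak{A}^*$ is of dimension
at least s+1. Let $\mathfrak{P}_1^*$ be a minimal prime of $\mathfrak{A}^*$. Since $\mathfrak{A}^* \equiv 0(\mathfrak{P}_1^*)$, we have
$\mathfrak{P}_1^* \cap \mathfrak{J} \supseteq \mathfrak{A}$, whence $\mathfrak{P}_1^* \cap \mathfrak{J} = \mathfrak{m}$, for $\mathfrak{A}$ is primary and its associated prime is
the ideal $\mathfrak{m}$ of non-units. Hence $\mathfrak{P}_1^* \supseteq \mathfrak{J}^*\mathfrak{m}$, and this shows that $\mathfrak{m}^*$ is not a
minimal prime of $\mathfrak{J}^*\cdot\mathfrak{m}$, since $\mathfrak{P}_1^*$ is a proper multiple of $\mathfrak{m}^*$ (dimension
$\mathfrak{m}^*$ = s, dimension $\mathfrak{P}_1^* \ge$ s+1), q.e.d.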
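
The dimension count at the end of the proof can be laid out explicitly. The following is only a sketch of how the bound s+1 presumably arises: it assumes that each of the $s_1 - s$ generators lowers the dimension of a minimal prime by at most one (the principal ideal theorem), a step the text does not spell out. Here $\mathfrak{J}^*$ is the quotient ring of the zero-dimensional... more precisely of the s-dimensional prime considered above, $\mathfrak{A}^*$ the ideal generated in it by the chosen elements, and $\mathfrak{P}_1^*$ any minimal prime of $\mathfrak{A}^*$.

```latex
% Sketch of the dimension count (the "at most one per generator" step is an
% assumption added here; it is not stated explicitly in the text).
\begin{align*}
\text{degree of transcendency of } \mathfrak{J}^* &= s_1 + 1,\\
\text{number of generators of } \mathfrak{A}^* &= s_1 - s,\\
\dim \mathfrak{P}_1^* &\ge (s_1 + 1) - (s_1 - s) = s + 1 \;>\; s = \dim \mathfrak{m}^* .
\end{align*}
% Since P_1^* contains J* m and has larger dimension than m*, it is a prime
% ideal strictly between J* m and m*, so m* is not a minimal prime of J* m.
```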

In the main theorem we have assumed that the birational correspondence 
has no fundamental elements on V’. In the general case of an arbitrary pair 
of birationally equivalent varieties V and V’ we may apply the main theorem 
to V and to the join V* of V and V'. If we then take into account Theorem 11
of II.5 we deduce the following corollary which expresses the local character 
of the main theorem: 


COROLLARY. If W is an irreducible subvariety of V at which V is locally
normal and if W is fundamental for the birational correspondence T between V
and some other variety V', then each irreducible component of T[W] which is not
fundamental for $T^{-1}$ is of higher dimension than W.


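9. The fundamental locus of a birational correspondence. The forms
$\varphi_0(\eta), \varphi_1(\eta), \dots, \varphi_m(\eta)$, which are proportional to the coordinates
$\eta_0', \eta_1', \dots, \eta_m'$ of the general point of V' (see equations (4), II.5) define a linear
system of forms:


$\varphi_{(\lambda)} = \lambda_0\varphi_0 + \lambda_1\varphi_1 + \cdots + \lambda_m\varphi_m$.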

We shall allow the parameters $\lambda_i$ to take arbitrary values (not all zero) in
the relative algebraic closure K' of K in $\Sigma$, that is, the $\lambda$'s shall be elements
of $\Sigma$ which are either in K or algebraic over K. We shall also assume that V
and V' are locally normal varieties. Under this assumption it is permissible
to identify K with K', since any ring of nonhomogeneous coordinates of the
general point of a locally normal variety is integrally closed and consequently
contains K'. We therefore assume that K itself is algebraically closed in $\Sigma$.

The principal ideal $(\varphi_{(\lambda)})$ in the ring $K[\eta_0, \eta_1, \dots, \eta_n]$ is (r-1)-dimensional.
Since V is locally normal, the conductor of this ring with respect to
its integral closure is a primary irrelevant ideal (or the unit ideal). Hence,
to within an irrelevant component which we shall disregard, the ideal $(\varphi_{(\lambda)})$
is quasi-gleich to a product of symbolic powers of minimal prime (homogeneous)
ideals. In particular, let

(9) $(\varphi_i) = \mathfrak{M}\,\mathfrak{A}_i$, i = 0, 1, ..., m,

where $\mathfrak{A}_0, \mathfrak{A}_1, \dots, \mathfrak{A}_m$ have no common factor. Then $\mathfrak{M}$ will be the h.c.d. of
all principal ideals $(\varphi_{(\lambda)})$, that is, we will have: $(\varphi_{(\lambda)}) = \mathfrak{M}\cdot\mathfrak{A}_{(\lambda)}$. The ideal $\mathfrak{A}_{(\lambda)}$
defines a pure (r-1)-dimensional subvariety $C_{(\lambda)}$ of V, which may be reducible
and in which each irreducible component is counted to a definite multiplicity
(equal to the exponent of the corresponding prime factor of $\mathfrak{A}_{(\lambda)}$). As the $\lambda$'s
vary in K, the variety $C_{(\lambda)}$ varies and describes a linear system |C| of (r-1)-dimensional
varieties on V, free from fixed components since $\mathfrak{A}_0, \mathfrak{A}_1, \dots, \mathfrak{A}_m$
have no common factor. We have in particular the members $C_0, C_1, \dots, C_m$
of |C| which correspond to the ideals $\mathfrak{A}_0, \mathfrak{A}_1, \dots, \mathfrak{A}_m$.

Let F be the algebraic subvariety of V defined by the ideal
$\mathfrak{F} = (\mathfrak{A}_0, \mathfrak{A}_1, \dots, \mathfrak{A}_m)$, or rather, by the radical of this ideal. The variety
F is of dimension at most r-2 and is common to $C_0, C_1, \dots, C_m$. We
show that F is the base manifold of the linear system |C|, that is, that F lies
on each $C_{(\lambda)}$. (This is not obvious, because of the presence of the factor $\mathfrak{M}$.)

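Let $\mathfrak{p}$ be a minimal ideal of $\mathfrak{F}$ and let us show that the assumption
$\mathfrak{A}_{(\lambda)} \not\equiv 0(\mathfrak{p})$ leads to a contradiction. Let $\beta$ be an element of $\mathfrak{A}_{(\lambda)}$ not in $\mathfrak{p}$.
Since $\varphi_i \equiv 0(\mathfrak{M})$, it follows(30), by (9), that $\beta\varphi_i \equiv 0(\varphi_{(\lambda)})$, i = 0, 1, ..., m. Let
$\beta\varphi_i = \beta_i\varphi_{(\lambda)}$, $\beta_i \in K[\eta]$. This relation can be written as follows: $\beta\mathfrak{A}_i = \beta_i\mathfrak{A}_{(\lambda)}$.
By hypothesis $\mathfrak{A}_i \equiv 0(\mathfrak{p})$, but $\mathfrak{A}_{(\lambda)} \not\equiv 0(\mathfrak{p})$. Hence $\beta_i \equiv 0(\mathfrak{p})$, and therefore the
relations $\beta\varphi_i = \beta_i\varphi_{(\lambda)}$ yield relations of the form


$\beta\varphi_i = \beta_{i0}\varphi_0 + \beta_{i1}\varphi_1 + \cdots + \beta_{im}\varphi_m$, i = 0, 1, ..., m,


where $\beta_{ij} \equiv 0(\mathfrak{p})$. From this we conclude that the determinant $\Delta = |\beta_{ij} - \delta_{ij}\beta|$
vanishes ($\delta_{ij} = 0$ if $i \ne j$, $\delta_{ii} = 1$), and this is impossible since $\Delta \equiv \pm\beta^{m+1}(\mathfrak{p})$,
whence $\Delta \not\equiv 0(\mathfrak{p})$.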
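
The determinant step may be written out as follows. This is a sketch of the standard argument, in which $\lambda_j$ are the parameters of the member under consideration and the explicit form $\beta_{ij} = \beta_i\lambda_j$ is supplied here as an assumption (it is what the relation $\beta\varphi_i = \beta_i\varphi_{(\lambda)}$ gives on expanding $\varphi_{(\lambda)}$).

```latex
% Sketch: why the determinant vanishes, and why this is impossible modulo p.
\begin{align*}
\beta\varphi_i = \beta_i\varphi_{(\lambda)} = \sum_{j=0}^{m} \beta_i\lambda_j\,\varphi_j
  \quad &\Longrightarrow \quad
  \sum_{j=0}^{m} \bigl(\beta_{ij} - \delta_{ij}\beta\bigr)\varphi_j = 0,
  \qquad \beta_{ij} = \beta_i\lambda_j \equiv 0 \ (\mathfrak{p}),\\
(\varphi_0,\dots,\varphi_m) \ne (0,\dots,0)
  \quad &\Longrightarrow \quad
  \Delta = \det\bigl(\beta_{ij} - \delta_{ij}\beta\bigr) = 0,\\
\Delta \equiv \det\bigl(-\delta_{ij}\beta\bigr) = (-\beta)^{m+1} = \pm\beta^{m+1} \ (\mathfrak{p}),
  \qquad & \beta \not\equiv 0 \ (\mathfrak{p}),\ \mathfrak{p} \text{ prime}
  \quad \Longrightarrow \quad \Delta \not\equiv 0 \ (\mathfrak{p}).
\end{align*}
```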


THEOREM 15. The base manifold F of the linear system |C| is also the fundamental
locus of the birational transformation T, that is, F has the property that
any irreducible subvariety W of V which is fundamental for T lies on F, and
conversely.


Proof. Let $\mathfrak{p}$ be the homogeneous prime ideal in the ring $K[\eta]$ which corresponds
to W.

Assume $W \not\subseteq F$. Then at least one of the m+1 varieties $C_i$ does not contain
W. Let, say, $W \not\subseteq C_0$, whence $\mathfrak{A}_0 \not\equiv 0(\mathfrak{p})$. We introduce the nonhomogeneous
coordinates $\xi_i' = \eta_i'/\eta_0'$ of the general point of V' and we denote, as usual,
the ring $K[\xi_1', \dots, \xi_m']$ by $\mathfrak{o}'$. We have $\xi_i' = \varphi_i/\varphi_0$, whence $\xi_i' \in Q(W)$.
Thus the entire ring $\mathfrak{o}'$ is contained in the quotient ring Q(W), and this implies
that W is not fundamental.

Conversely, assume that W is not fundamental. There corresponds then to
W a unique variety W' on V' and we have $Q(W') \subseteq Q(W)$. We may assume
that $\eta_0' \ne 0$ on W', and then we will have $\mathfrak{o}' \subseteq Q(W')$, whence a fortiori
$\mathfrak{o}' \subseteq Q(W)$. The fact that the quotients $\varphi_i/\varphi_0$ all belong to Q(W) leads immediately
to the conclusion that $\mathfrak{A}_0 \not\equiv 0(\mathfrak{p})$. Hence W does not lie on F, q.e.d.

(30) Strictly speaking, the congruences $\beta \equiv 0(\mathfrak{A}_{(\lambda)})$, $\varphi_i \equiv 0(\mathfrak{M})$ do not necessarily imply that
$\beta\varphi_i$ is a multiple of $\varphi_{(\lambda)}$, since the equation (9) is only true to within an irrelevant component.
However, in view of the regularity and the (1, 1) character of the birational correspondence
between a locally normal variety and its derived normal variety, it is permissible to give the
proof under the assumption that V and V' are not only locally normal but normal. If V is normal,
then the equation (9) is exact. The same remark applies to the other theorems proved in
this section.

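
Theorem 15 can be illustrated on the classical plane quadratic transformation. The example below is a sketch added for orientation; the choice of V as the projective plane with coordinates $y_0, y_1, y_2$ and of the three forms $\varphi_i$ is an assumption made here and does not come from the text.

```latex
% Illustrative sketch (assumed example): V = projective plane, r = 2, m = 2.
\begin{align*}
\varphi_0 &= y_1 y_2, \qquad \varphi_1 = y_0 y_2, \qquad \varphi_2 = y_0 y_1,\\
\mathfrak{M} &= (1), \qquad \mathfrak{A}_i = (\varphi_i) \quad
   \text{(the three forms have no common factor)},\\
C_{(\lambda)} &:\ \lambda_0 y_1 y_2 + \lambda_1 y_0 y_2 + \lambda_2 y_0 y_1 = 0
   \quad \text{(a net of conics)},\\
\sqrt{(\varphi_0, \varphi_1, \varphi_2)} &= (y_1, y_2) \cap (y_0, y_2) \cap (y_0, y_1),\\
F &= \{(1,0,0),\ (0,1,0),\ (0,0,1)\}, \qquad \dim F = 0 = r - 2 .
\end{align*}
```

Every conic of the net passes through the three vertices, so F is the base manifold of |C|, and these three points are exactly the fundamental points of the transformation, as the theorem asserts.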
While Theorem 15 gives full information about the location of the funda- 
mental elements of a birational correspondence, the following theorem tells 
us where the irregular varieties are located: 


THEOREM 16. If an irreducible subvariety W' of V' is irregular for $T^{-1}$ then
W' lies on the total transform T{F} of the fundamental locus F of T. Conversely,
if W' lies on T{F}, then it is either irregular or fundamental for $T^{-1}$.


The proof is immediate. For if W' is irregular, and if $W = T^{-1}(W')$,
then W is fundamental (Theorem 8 (C), II.3), whence $W \subseteq F$. Consequently,
$W' = T(W) \subseteq T\{F\}$. Conversely, if $W' \subseteq T\{F\}$, then W' must correspond to
some irreducible subvariety W of F. Since W is fundamental, W' cannot be
regular, q.e.d.

The linear system |C| which is defined by the linear family of forms $\varphi_{(\lambda)}$
has always played an important part in the study of the birational correspondence
T with which this system is associated. Theorem 15 is one illustration
of the geometric connection between |C| and T. Another property of
the linear system |C| which follows in a straightforward fashion from the
definition is the following: the birational correspondence T transforms the linear
system |C| into the system of hyperplane sections of V'. This statement should
be understood in the following sense: a general member $C_{(\lambda)}$ of |C| is regular and
$T(C_{(\lambda)})$ is the section $\Gamma'_{(\lambda)}$ of V' with the hyperplane $\lambda_0 y_0' + \lambda_1 y_1' + \cdots + \lambda_m y_m' = 0$.
However, for special values of the $\lambda$'s, it may very well happen that $C_{(\lambda)}$,
or $\Gamma'_{(\lambda)}$, or both, contain irreducible components which are irregular and which
therefore correspond to fundamental varieties.

10. Isolated fundamental varieties. We assume as before that V is locally 
normal and we keep the notation V* for the join of V and V', and T* for the
birational correspondence between V and V*. 


DEFINITION 8. Let $W_1^*, W_2^*, \dots, W_k^*$ be the irreducible components of the
total transform T*{F}, where F is the fundamental locus of T* (and hence also
of T) on V. The irreducible subvarieties $F_i = T^{*-1}(W_i^*)$ of V are called the isolated
fundamental varieties of T (and also of T*).


It is clear that the isolated fundamental varieties $F_i$ lie on the fundamental
locus F. It is also not difficult to see that the irreducible components of F are
among the isolated fundamental varieties. For let W be an irreducible component
of F and let W* be an irreducible component of the transform T*[W].
We have $W^* \subseteq T^*\{F\}$, since $T^{*-1}(W^*) = W \subseteq F$. Consequently W* lies on one
of the varieties $W_i^*$, say $W^* \subseteq W_1^*$. We have then $T^{*-1}(W^*) \subseteq T^{*-1}(W_1^*) = F_1$,
that is, $W \subseteq F_1$. Since W is a component of F, we conclude that $F_1 = W$.

It is important to point out that in addition to the irreducible components
of the fundamental locus F there may exist other "embedded" isolated
fundamental varieties, which are proper subvarieties of the irreducible components
of F. Thus in the three-dimensional case we may have a fundamental
curve $\Gamma$ on V to which there corresponds a surface on V*, and on that fundamental
curve $\Gamma$ there may exist some special point P to which there also
corresponds a surface on V*. This point P must be regarded as an isolated
fundamental point, although it is embedded in the fundamental curve $\Gamma$. The
term "isolated" refers not to the position of the point P with respect to the
fundamental locus F but to its role in the birational correspondence T.

By the main theorem each irreducible component $W_i^*$ of T*{F} is of
higher dimension than the corresponding isolated fundamental variety $F_i$.
Under certain conditions it is possible to assert that each $W_i^*$ is of dimension
r-1. We proceed to find such conditions.

Let $\eta_{ij}$ denote as usual the homogeneous coordinates of the general point
of the join V*, where $\eta_{ij} = \eta_i\varphi_j$ (see II.5 and II.9). Let us consider quite
generally an arbitrary homogeneous ideal $(g_0, g_1, \dots, g_h)$ in the ring
$K[\eta_0, \eta_1, \dots, \eta_n]$, where each $g_i$ is a form, say of degree $\nu_i$. We put


(10) $g_{ij} = g_i(\eta_{0j}, \eta_{1j}, \dots, \eta_{nj})$, i = 0, 1, ..., h; j = 0, 1, ..., m.

The (m+1)(h+1) forms $g_{ij}$ in the $\eta_{ij}$'s generate a homogeneous ideal in the
ring $K[\eta_{00}, \eta_{10}, \dots, \eta_{nm}]$.


LEMMA 7. If N is the subvariety of V defined by the ideal $(g_0, g_1, \dots, g_h)$
and if N* is the subvariety of V* defined by the ideal $(g_{00}, g_{10}, \dots, g_{hm})$, then N*
is the total transform of N, that is, $N^* = T^*\{N\}$.


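Proof. Let W and W* be two corresponding irreducible subvarieties of V
and V*, respectively. We have to show that $W^* \subseteq N^*$, if and only if $W \subseteq N$.

Assume that $W \subseteq N$. Without loss of generality we may assume that $\eta_0 \ne 0$
on W. Then if v denotes a valuation of centers W and W*, we will have:


(11) $v(\eta_i/\eta_0) \ge 0$, $v(g_i/\eta_0^{\nu_i}) > 0$.


Also without loss of generality we may assume that $v(\varphi_j/\varphi_0) \ge 0$, for
j = 1, 2, ..., m. We will have then $v(\eta_{ij}/\eta_{00}) = v(\eta_i/\eta_0) + v(\varphi_j/\varphi_0) \ge 0$. By (10),
we can write:

(12) $g_{ij} = g_i\,\varphi_j^{\nu_i}$,


and hence $g_{ij}/\eta_{00}^{\nu_i} = (g_i/\eta_0^{\nu_i})\,(\varphi_j/\varphi_0)^{\nu_i}$. Consequently $v(g_{ij}/\eta_{00}^{\nu_i}) > 0$, by (11), and
this shows that $W^* \subseteq N^*$.

Conversely, assume that $W^* \subseteq N^*$. Then if $\eta_{00} \ne 0$ on W*, we find that
$\eta_0 \ne 0$ on W (since $\eta_i/\eta_0 = \eta_{i0}/\eta_{00}$). On the other hand we have:
$g_{i0}/\eta_{00}^{\nu_i} = g_i/\eta_0^{\nu_i}$, whence $v(g_i/\eta_0^{\nu_i}) > 0$, that is, $W \subseteq N$, q.e.d.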

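We now apply the lemma to the case in which N is given by the ideal
$(\varphi_0, \varphi_1, \dots, \varphi_m)$. Then N* is given by the ideal $(\varphi_{00}, \varphi_{10}, \dots, \varphi_{mm})$, where
$\varphi_{ij} = \varphi_i(\eta_{0j}, \eta_{1j}, \dots, \eta_{nj})$.

The relations (12) now yield: $\varphi_{ij} = \varphi_i\varphi_j^{\nu}$, where $\nu$ is the common degree of the
forms $\varphi_i$. From these relations we deduce the following: $\varphi_{ij}^{\nu+1} = \varphi_{ii}\varphi_{jj}^{\nu}$, and consequently
the two ideals $(\varphi_{00}, \varphi_{10}, \dots, \varphi_{mm})$ and $(\varphi_{00}, \varphi_{11}, \dots, \varphi_{mm})$ have the
same radical. Therefore N* is also defined by the ideal $(\varphi_{00}, \varphi_{11}, \dots, \varphi_{mm})$.
Now we have $\eta_{kj}^{\nu+1}\varphi_{ii} = \eta_{ki}^{\nu+1}\varphi_{jj}$ for any i, j = 0, 1, ..., m;
k = 0, 1, ..., n. Therefore each irreducible component of N* at which
$\eta_{kj} \ne 0$, for some k and j, is also a component of the principal ideal $(\varphi_{jj})$ and
is therefore (r-1)-dimensional. Consequently N* is pure (r-1)-dimensional.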
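
The identities used in this paragraph follow by direct substitution. The derivation below is a sketch under the assumption that $\varphi_{ij}$ stands for $\varphi_i(\eta_{0j}, \dots, \eta_{nj})$ with $\eta_{kj} = \eta_k\varphi_j$, and that all the $\varphi_i$ are forms of the same degree $\nu$; the last identity is offered as one relation that yields the stated conclusion, not necessarily the one printed in the original.

```latex
% Sketch: identities among the forms phi_{ij} on the join, assuming
%   eta_{kj} = eta_k * phi_j   and   deg(phi_i) = nu   for all i.
\begin{align*}
\varphi_{ij} &= \varphi_i(\eta_0\varphi_j, \dots, \eta_n\varphi_j) = \varphi_i\,\varphi_j^{\nu},
   \qquad\text{in particular}\qquad \varphi_{ii} = \varphi_i^{\nu+1},\\
\varphi_{ij}^{\nu+1} &= \varphi_i^{\nu+1}\,\varphi_j^{\nu(\nu+1)}
   = \varphi_{ii}\,\varphi_{jj}^{\nu},\\
\eta_{kj}^{\nu+1}\,\varphi_{ii} &= \eta_k^{\nu+1}\,\varphi_j^{\nu+1}\,\varphi_i^{\nu+1}
   = \eta_{ki}^{\nu+1}\,\varphi_{jj}.
\end{align*}
% The second line shows that every phi_{ij} lies in the radical of
% (phi_00, phi_11, ..., phi_mm); the third shows that, where eta_{kj} != 0,
% the vanishing of phi_{jj} forces the vanishing of every phi_{ii}.
```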

In view of the formulas (9) of II.9, the variety of the ideal $(\varphi_0, \varphi_1, \dots, \varphi_m)$
consists of the (r-1)-dimensional variety of the ideal $\mathfrak{M}$ and of the fundamental
locus F. We therefore can assert that T*{F} is pure (r-1)-dimensional
if $\mathfrak{M}$ is the unit ideal. The hypothesis $\mathfrak{M} = (1)$ implies that each member
$C_{(\lambda)}$ of the linear system |C| associated with the birational correspondence is
complete intersection of V with a hypersurface of the ambient projective space,
namely with the hypersurface $\lambda_0\varphi_0(y_0, \dots, y_n) + \cdots + \lambda_m\varphi_m(y_0, \dots, y_n) = 0$.

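Conversely, let us assume that each $C_{(\lambda)}$ is complete intersection. Then in
particular $C_0$ is complete intersection, whence the ideal $\mathfrak{A}_0$ is a principal ideal,
say $\mathfrak{A}_0 = (\psi_0)$. We have: $(\varphi_i\psi_0/\varphi_0) = \mathfrak{M}\mathfrak{A}_i\mathfrak{A}_0/\mathfrak{M}\mathfrak{A}_0 = \mathfrak{A}_i$, that is, $(\varphi_i\psi_0/\varphi_0)$ is an
integral ideal. Consequently the quotients $\varphi_i\psi_0/\varphi_0$ are forms in the y's, say
$\varphi_i\psi_0/\varphi_0 = \psi_i$. The forms $\psi_0, \psi_1, \dots, \psi_m$ are proportional to the forms
$\varphi_0, \varphi_1, \dots, \varphi_m$, and the linear system |C| is also defined by the linear family
of forms $\lambda_0\psi_0 + \lambda_1\psi_1 + \cdots + \lambda_m\psi_m$. If we use the $\psi$'s instead of the $\varphi$'s, we
will have $\mathfrak{M} = (1)$, and we reach again the conclusion that T*{F} is pure
(r-1)-dimensional.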

We can go a step further. Let us point out that if we define a projective
model $V_\rho'$ by the condition that the homogeneous coordinates of its general
point be given by a linear base of the forms of degree $\rho$ in $\eta_0', \eta_1', \dots, \eta_m'$,
then $V'$ and $V_\rho'$ are in regular birational correspondence (Lemma 5, II.2).
The transition from $V'$ to $V_\rho'$ is equivalent to passing from the linear system
$|C|$ to the least linear system which contains as members all sets of $\rho$ C's.
Hence, by the preceding result, we conclude that if a sufficiently high multiple
of a C is complete intersection, then $T^*\{F\}$ is pure (r−1)-dimensional.

In order to conclude with a similar result of a local character, let $F_1$ be
any isolated fundamental variety of T, and let us assume that C is locally,
at $F_1$, complete intersection. By that we mean that some hypersurface cuts V




along C and along a residual variety which does not contain $F_1$. If we replace
the $\phi$'s by a suitable set of proportional forms, we may arrange matters so
that the variety of the ideal $\mathfrak{M}$ does not contain $F_1$. Since $N=M+F$, it is
clear that $T^*\{N\}=T^*[M]+T^*\{F\}$, where we write $T^*[M]$ instead of
$T^*\{M\}$, since $T^*\{M\}-T^*[M]$ lies on $T^*\{F\}$ (Theorem 13, II.6). Since
$F_1\not\subset M$, no component of $T^*[F_1]$ can lie on $T^*[M]$. It follows that the
irreducible components of $T^*\{F\}$ which correspond to the isolated fundamental
variety $F_1$ are also components of $T^*\{N\}$, and hence are (r−1)-dimensional.
The same conclusion is reached if we assume that some sufficiently high
multiple of C is complete intersection locally at $F_1$.

The above results refer to V and to the join of V and V’. In particular, 
if the birational correspondence T has no fundamental elements on V’ then 
V’ may play the role of V*, since V’ and V* are then in regular birational 
correspondence. We summarize our results in the following theorem:


THEOREM 17. If a birational correspondence T between two locally normal
r-dimensional varieties V and $V'$ has no fundamental elements on $V'$ and if F
denotes the fundamental locus of T on V, then an irreducible component of $T\{F\}$
is of dimension r−1, provided the corresponding isolated fundamental variety $F_1$
has the property that the members of the linear system $|C|$ associated with T,
or their sufficiently high multiples, are complete intersections locally, at $F_1$.


COROLLARY. To an isolated simple fundamental variety there always corresponds
an (r−1)-dimensional variety on $V'$ (see van der Waerden [6, p. 154]).


For locally, at a simple subvariety of V, every (r−1)-dimensional
subvariety of V is complete intersection(31).

11. Monoidal transformations. Given a homogeneous ideal $\mathfrak{A}$ in the ring
$K[\eta_0, \eta_1, \dots, \eta_n]$ of homogeneous coordinates of the general point of V,
it is possible to associate with $\mathfrak{A}$ an infinite set S of birational transforms of V
such that: (1) the birational correspondence between V and any variety $V'$
of the set has no fundamental elements on $V'$ and such that (2) any two
varieties of the set are in regular birational correspondence. The varieties $V'$
of the set S shall be defined as follows. Let us take a base of $\mathfrak{A}$ consisting of
forms of least possible degrees, and let $a$ be the highest degree of the forms
in that base. We define $V'$ by its general point $(\phi_0, \phi_1, \dots, \phi_m)$, where the
$\phi$'s form a linear base for the forms of a given degree $\nu$ in $\mathfrak{A}$ and where we
impose on $\nu$ the condition: $\nu\ge a+1$. For $\nu=a+1, a+2, \dots$, we get an
infinite set of models $V'$, and this is our set S. We shall denote these models
by $V_\nu'$.


(31) For the case of algebraically closed ground fields of characteristic zero see our paper
[8, p. 664]. There the proof is given explicitly for surfaces only, but actually exactly the same 
proof applies to higher varieties. For ground fields which are not algebraically closed or which 
are of characteristic p, the statement can be derived from the following result obtained by Irvin 
Cohen in his dissertation: if the characteristic of a complete p-series ring coincides with the 
characteristic of its residue field, the ring is a power series ring over a field. 




First of all it is clear that each $V_\nu'$ is birationally equivalent to V. For $\mathfrak{A}$
contains at least one form $\psi$ of degree $\nu-1$ so that the products $\eta_0\psi, \eta_1\psi, \dots, \eta_n\psi$
can be identified with $n+1$ of the $\phi$'s. This shows that the quotients $\phi_i/\phi_0$
generate the field $\Sigma$.

Since $\mathfrak{A}$ has a basis consisting of forms of degree at most $a$, it follows that
if $(\phi_0, \phi_1, \dots, \phi_m)$ is a basis for the forms in $\mathfrak{A}$ of degree $\nu$, then the products
$\eta_i\phi_j$ constitute a basis for the forms in $\mathfrak{A}$ which are of degree $\nu+1$. This holds
true also for $\nu=a$. From this it follows that $V_{\nu+1}'$ is the join of V and $V_\nu'$,
provided $\nu\ge a+1$. Therefore the birational correspondence between V and $V_\nu'$
has no fundamental points on $V_\nu'$, provided $\nu\ge a+2$. But then $V_{\nu+1}'$, the join
of V and $V_\nu'$, is a regular birational transform of $V_\nu'$, always provided that
$\nu\ge a+2$. As for the case $\nu=a+1$, we can still regard $V_{a+1}'$ as the join of V
and $V_a'$, although in this case $V_a'$ need not be birationally equivalent to V.
At any rate, the proof that the birational correspondence between V and
$V_{a+1}'$ has no fundamental elements on $V_{a+1}'$ is exactly the same as that given
for the join in II.5.

Thus we may say that a given homogeneous ideal $\mathfrak{A}$ in the ring
$K[\eta_0, \eta_1, \dots, \eta_n]$ determines, to within a regular birational transformation,
a birational transform $V'$ of V such that the birational correspondence
between V and $V'$ has no fundamental elements on $V'$.
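To make the construction concrete, here is a small worked illustration that is not part of the original text. It assumes, purely for the sake of example, that V is the projective plane with homogeneous coordinates $\eta_0, \eta_1, \eta_2$ and that the ideal is the prime ideal of the point (1, 0, 0); both choices are ours.

```latex
% Illustrative assumption: V = projective plane, ideal of the point (1,0,0).
\[
\mathfrak{A}=(\eta_1,\eta_2),\qquad a=1 .
\]
% A linear base of the forms of degree \nu = a+1 = 2 contained in the ideal:
\[
(\phi_0,\dots,\phi_4)=(\eta_0\eta_1,\ \eta_0\eta_2,\ \eta_1^2,\ \eta_1\eta_2,\ \eta_2^2).
\]
% With \psi=\eta_1 (a form of degree \nu-1 in the ideal) the products \eta_i\psi
% are among the \phi's, so the quotients \phi_i/\phi_0 generate the field:
\[
\eta_0\psi=\phi_0,\quad \eta_1\psi=\phi_2,\quad \eta_2\psi=\phi_3,
\qquad
\frac{\eta_1}{\eta_0}=\frac{\phi_2}{\phi_0},\quad
\frac{\eta_2}{\eta_0}=\frac{\phi_3}{\phi_0}.
\]
% For \nu = a = 1 the base is (\eta_1,\eta_2); the single quotient
% \eta_2/\eta_1 does not generate the field of rational functions of the plane.
```

In this example the model obtained for $\nu = a$ is a line, which is not birationally equivalent to V. This is the situation allowed for in the text when it remarks that the model of index $a$ need not be birationally equivalent to V.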

Let N be the subvariety of V defined by the ideal $\mathfrak{A}$. It is quite clear that
if $\nu\ge a$, then the ideal generated by our base $(\phi_0, \phi_1, \dots, \phi_m)$ differs from
$\mathfrak{A}$ only by an irrelevant component. Hence, by Theorem 15 of II.9, we
conclude that if N is of dimension at most r−2, then N is the fundamental
locus F of the birational correspondence between V and $V'$. In particular,
if N is empty, that is, if $\mathfrak{A}$ is an irrelevant ideal, then $V'$ is a regular transform
of V.

If, however, N is of dimension r−1, then the fundamental locus F will
consist of the irreducible components of N which are of dimension less than
r−1 and possibly of some proper subvarieties of the (r−1)-dimensional
components of N. Thus, even in the case in which N is pure (r−1)-dimensional,
it may very well happen that F is not empty. According to Theorem 15,
this will happen if the residual intersections of the hypersurfaces $\phi_i=0$
($i=0, 1, \dots, m$) with V, outside N, have a base manifold on N.

Let now $\mathfrak{A}_1$ be another homogeneous ideal in the ring $K[\eta_0, \eta_1, \dots, \eta_n]$.
If W is an irreducible subvariety of V given by a prime ideal $\mathfrak{p}$, we shall say
that $\mathfrak{A}$ and $\mathfrak{A}_1$ coincide locally at W, if the two ideals differ only by primary
components whose associated prime ideals are not multiples of $\mathfrak{p}$. In other
words, $\mathfrak{A}$ and $\mathfrak{A}_1$ coincide locally at W if they give rise to one and the same
ideal in the quotient ring of W (see I.1).


LEMMA 8. If $\mathfrak{A}$ and $\mathfrak{A}_1$ coincide locally at an irreducible subvariety W of V
and if $V'$ and $V_1'$ are the birational transforms of V which are determined (to
within a regular birational transformation), respectively, by $\mathfrak{A}$ and by $\mathfrak{A}_1$, then




any irreducible subvariety of $V'$ which corresponds to W is regular for the
birational correspondence between $V'$ and $V_1'$.


Proof. Let $\mathfrak{A}^*$ be the ideal which is obtained from either $\mathfrak{A}$ or $\mathfrak{A}_1$ by the
omission of all primary components whose associated prime ideals are not
multiples of $\mathfrak{p}$. Let $V^*$ be the birational transform of V determined by the
ideal $\mathfrak{A}^*$. It is sufficient to prove the lemma for $\mathfrak{A}$ and $\mathfrak{A}^*$, and for $\mathfrak{A}_1$ and $\mathfrak{A}^*$.
We shall prove it, for instance, for $\mathfrak{A}$ and $\mathfrak{A}^*$.

Let $W'$ be an irreducible subvariety of $V'$ which corresponds to W. We
have to prove that $W'$ is regular for the birational correspondence between
$V'$ and $V^*$. Let $\phi_0, \phi_1, \dots, \phi_m$ be a linear base of the forms of degree $\nu$ which
belong to $\mathfrak{A}$. Since $\mathfrak{A}\subset\mathfrak{A}^*$, we may complete this base to a linear base
$\phi_0, \phi_1, \dots, \phi_m, \phi_{m+1}, \dots, \phi_{m+s}$ for the forms of degree $\nu$ which belong to
the ideal $\mathfrak{A}^*$. We take $\nu$ sufficiently high, so that $(\phi_0, \phi_1, \dots, \phi_m)$ and
$(\phi_0, \phi_1, \dots, \phi_m, \dots, \phi_{m+s})$ are the general points, respectively, of $V'$
and $V^*$.

Let $v$ be any valuation whose center on $V'$ is $W'$ and whose center on V
is W. We may assume that $v(\phi_i/\phi_0)\ge 0$, $i=1, 2, \dots, m$, whence $W'$ is at
finite distance with respect to the nonhomogeneous coordinates $\xi_i'=\phi_i/\phi_0$,
$i=1, 2, \dots, m$, of the general point of $V'$. Let $W^*$ be the center of $v$ on $V^*$.
By our definition of the ideal $\mathfrak{A}^*$ there exists a form $g(\eta_0, \eta_1, \dots, \eta_n)$
such that $g\cdot\mathfrak{A}^*\equiv 0\ (\mathfrak{A})$ and such that $g\ne 0$ on W. We have then, in particular:
$g\phi_{m+j}=A_0\phi_0+A_1\phi_1+\cdots+A_m\phi_m$, $j=1, 2, \dots, s$, where $A_0, A_1, \dots, A_m$
are forms in $\eta_0, \eta_1, \dots, \eta_n$, of the same degree as $g$. We write:

(13) $\phi_{m+j}/\phi_0 = A_0/g + (A_1/g)(\phi_1/\phi_0) + \cdots + (A_m/g)(\phi_m/\phi_0)$.

Since $g\ne 0$ on W we have $v(A_i/g)\ge 0$, $i=0, 1, \dots, m$. Since also $v(\phi_i/\phi_0)\ge 0$,
it follows from the above relation (13) that $v(\phi_{m+j}/\phi_0)\ge 0$. Hence $W^*$
is at finite distance with respect to the nonhomogeneous coordinates
$\xi_1', \xi_2', \dots, \xi_{m+s}'$ of the general point of $V^*$, where $\xi_i'=\phi_i/\phi_0$. Since the
ring $K[\xi_1', \xi_2', \dots, \xi_m']$ is a subring of the ring $K[\xi_1', \xi_2', \dots, \xi_{m+s}']$, it
follows that $Q(W')\subset Q(W^*)$.

On the other hand, since $g\ne 0$ on W, the quotients $A_i/g$ belong to the
quotient ring $Q(W)$. Since $Q(W)\subset Q(W')$ and since also the quotients $\phi_i/\phi_0$,
$i=1, 2, \dots, m$, are in $Q(W')$, we conclude, by (13), that the entire ring
$K[\xi_1', \xi_2', \dots, \xi_{m+s}']$ is contained in $Q(W')$. From this it follows immediately
that $Q(W^*)\subset Q(W')$, whence $Q(W^*)=Q(W')$, q.e.d.


COROLLARY. If $\mathfrak{A}$ and $\mathfrak{A}_1$ differ only by an irrelevant primary component,
then $V'$ and $V_1'$ are in regular birational correspondence.


From the above general consideration we pass to the special case which
interests us, namely to the case in which the given homogeneous ideal $\mathfrak{A}$ is a




prime ideal $\mathfrak{p}$, of dimension $s$, $0\le s\le r-2$. Let W be the irreducible
subvariety of V defined by $\mathfrak{p}$. The birational transformation T determined by
the ideal $\mathfrak{p}$ (that is, by a linear base of forms of sufficiently high degree in $\mathfrak{p}$)
is called a monoidal transformation of center W. In the special case when W is a
point P the transformation is called quadratic (of center P). The birational
transform $V'$ of V, under a monoidal transformation of given center, is
determined to within a regular birational correspondence. The center W of a
monoidal transformation is the fundamental locus of the transformation.
Moreover, from Theorem 17, II.10, it follows that in the present case $T\{W\}$
is pure (r−1)-dimensional. However, it should be pointed out that $T\{W\}$
may very well be reducible and (this is significant) some components of
$T\{W\}$ may correspond to proper subvarieties of W. In other words, the center
W of a monoidal transformation is not necessarily the only isolated fundamental
variety of the transformation(32). We shall see presently that this complication
arises only if W carries some singular points of V or if W itself has
singularities.

Of special importance in applications are monoidal transformations with 
simple center, that is, with center at a simple subvariety W of V. The special 
case of a quadratic transformation with simple center has been considered 
in our paper [11]. The results established there carry over to monoidal trans- 
formations with simple center, in view of the following considerations. Let 
the ground field K be extended by the adjunction of s elements of Q(W) which 
are algebraically independent on W. With respect to this new ground field $K_1$,
the variety W becomes a (simple) point and the monoidal transformation T
becomes a quadratic transformation. Therefore certain properties of the
monoidal transformation T, over K, can be deduced from corresponding
properties of the quadratic transformation over $K_1$. However, only such properties
of T can be deduced in this fashion as concern W as a whole. What happens 
to special points or special subvarieties of W requires new considerations. 
For instance, we have proved in the quoted paper [11] that if T is a quadratic 
transformation with simple center P, then the transform T[P] (which, since 
P is a point, coincides with the total transform T{P}) is an irreducible, 
simple and (r—1)-dimensional subvariety of V’ and, moreover, that every 


(32) Here is an example. Let V be the quadric hypersurface $u^2=yz$ in the 4-dimensional
space of the variables x, y, z, u. This hypersurface has the double line $y=z=u=0$. Let W be the
line $x=y=u=0$. As nonhomogeneous coordinates of the monoidal transform $V'$ of V we can
take the elements x, y, z, u, x/u, y/u. Let $\mathfrak{o}'$ be the ring of these coordinates and let
$\mathfrak{o}=K[x, y, z, u]$. We have $\mathfrak{o}'=\mathfrak{o}[x/u, y/u]=K[x_1, y_1, z]$, where $x_1=x/u$, $y_1=y/u$. Here
$\mathfrak{p}=(x, y, u)$ is the prime ideal of W and we have $\mathfrak{o}'\cdot\mathfrak{p}=\mathfrak{o}'\cdot u=\mathfrak{o}'\cdot y_1z$, that is, that part of $T\{W\}$
which is at finite distance consists of two planes: $y_1=0$ and $z=0$ (note that the affine model $V'$
is in regular birational correspondence with the affine space of the variables $x_1$, $y_1$, z). The first
plane corresponds to W (since $\mathfrak{o}'\cdot y_1\cap\mathfrak{o}=\mathfrak{p}$). But the plane $z=0$ corresponds to the point
$x=y=z=u=0$. This point is imbedded in W, but according to our terminology must be regarded
as an isolated fundamental point.
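The computation behind this example can be written out explicitly. The following check is an added sketch, not part of the original footnote; it reads the equation of V as $u^2 = yz$ and writes $x_1 = x/u$, $y_1 = y/u$, and $\mathfrak{o}'$ for the ring of coordinates of the transform.

```latex
% Expressing x, y, u through x_1 = x/u, y_1 = y/u and z, using u^2 = yz:
\[
u=\frac{u^2}{u}=\frac{yz}{u}=y_1z,\qquad
y=y_1u=y_1^2z,\qquad
x=x_1u=x_1y_1z .
\]
% Consistency with the equation of V:
\[
u^2=(y_1z)^2=(y_1^2z)\,z=yz .
\]
% Extended ideal of \mathfrak{p}=(x,y,u): every generator is a multiple of y_1 z.
\[
\mathfrak{o}'\cdot\mathfrak{p}=(x_1y_1z,\ y_1^2z,\ y_1z)=(y_1z).
\]
% On the plane y_1 = 0: x = y = u = 0 and z is free, i.e. the whole line W.
% On the plane z = 0:  x = y = z = u = 0, i.e. a single point of W.
```

The two factors of $y_1z$ thus account for the two planes named in the footnote, and the substitution shows directly why the second plane lies over a single point.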




point of T[P] is likewise simple for $V'$. Now when we pass from the ground
field K to the ground field $K_1$, we lose all those components of $T\{W\}$ which
cannot be regarded as varieties over $K_1$, that is, all those components of
$T\{W\}$ which correspond to proper subvarieties of W (since on any proper
subvariety of W the s elements which have been adjoined to K are
algebraically dependent). Consequently, the correct extrapolation of the above result
concerning quadratic transformations to monoidal transformations is the
following:


THEOREM 18. If the center W of a monoidal transformation T is a simple 
subvariety of V, then the transform(33) T[W] of W is an irreducible, simple,
(r—1)-dimensional subvariety of V’, and every irreducible subvariety W' of 
T[W] is also simple for V', provided W’ =T(W). 


The total transform $T\{W\}$ may possess components which are not
components of T[W] (even if W is simple(32)), and concerning those components
we can assert nothing. Likewise T[W] may contain points which are singular
for $V'$. Thus, if V is three-dimensional and if W is a curve, then T[W] is a
surface which may carry, in addition to a finite number of singular points
of $V'$, also a finite number of singular curves of $V'$, but each such curve must
correspond to a point of W.

The following theorem will show, among other things, that these compli- 
cations can arise only from points or subvarieties of W which are singular 
for V or for W. 


THEOREM 19. Let $W_1$ be an irreducible subvariety of W, of dimension $s_1$. If
$W_1$ is simple both for V and W, then $T[W_1]$ lies on T[W], is irreducible, is of
dimension $r-1-s+s_1$, and is simple both for $V'$ and for T[W]. Moreover, every
irreducible subvariety of $T[W_1]$ which corresponds to $W_1$ is likewise simple for
$V'$, T[W] and also for $T[W_1]$.


Proof. By the usual device of ground field extension we can achieve a
reduction to the case $s_1=0$. Therefore we assume that $W_1$ is a point P of W,
simple both for V and W. It is then possible to select uniformizing parameters
$t_1, t_2, \dots, t_r$ at P in such a fashion that W be locally, at P, complete
intersection of the r−s hypersurfaces(34) $t_1=0, t_2=0, \dots, t_{r-s}=0$. Then


(33) Not the total transform $T\{W\}$!

(34) Proof. Quite generally, the uniformizing parameters $t_1, t_2, \dots, t_{r-\rho}$ of a simple
$\rho$-dimensional subvariety L of V have the following property: if $g(t_1, t_2, \dots, t_{r-\rho})=0$ is a true
homogeneous relation between these parameters, with coefficients in the quotient ring Q(L), then all
these coefficients must be zero on L, that is, they are non-units of Q(L). (See [9, p. 202, (15) and p. 207, (23)].)
In view of this property and also because Q(L) is a chain theorem ring in which the non-units
form an ideal, the quotient ring of a simple subvariety is a p-series ring (p-Reihenring) in the
sense of Krull [3]. We shall therefore apply properties of p-series rings due to Krull.

Let $\mathfrak{J}$ denote the quotient ring of P and let $\mathfrak{m}$ be the ideal of non-units in $\mathfrak{J}$. If $\tau_1, \tau_2, \dots, \tau_r$
are uniformizing parameters of P, then we have $\mathfrak{m}=\mathfrak{J}\cdot(\tau_1, \tau_2, \dots, \tau_r)$. If $\alpha$ is any element of $\mathfrak{J}$
and if $\alpha\equiv 0\ (\mathfrak{m}^h)$, $\alpha\not\equiv 0\ (\mathfrak{m}^{h+1})$, then $\alpha$ can be written as a form $g_h(\tau_1, \tau_2, \dots, \tau_r)$, of degree h,




$t_1, t_2, \dots, t_{r-s}$ will be uniformizing parameters for W (that is, the ideal
generated by $t_1, t_2, \dots, t_{r-s}$ in Q(W) will be the prime ideal of non-units), and
the ideal generated by the same elements in Q(P) will be the prime ideal of W
in Q(P).

The uniformizing parameters $t_1, t_2, \dots, t_{r-s}$ of W are proportional to
certain forms $\psi_1, \psi_2, \dots, \psi_{r-s}$ in the homogeneous coordinates $\eta_0, \eta_1, \dots, \eta_n$
of the general point of V. Since these uniformizing parameters belong to
Q(P), it follows that the factor of proportionality can be so selected that the
ideal generated by the forms $(\psi_1, \psi_2, \dots, \psi_{r-s})$ coincide locally at P with the
prime ideal of W. Hence by Lemma 8 we can replace the transformation T
by the birational transformation defined by the ideal(35) $(\psi_1, \psi_2, \dots, \psi_{r-s})$.
Therefore we may assume that T, instead of being our original monoidal
transformation of center W, is the birational transformation which carries V
into the variety $V'$ whose general point is $(\omega_{01}, \dots, \omega_{n,r-s})$, where(36)
$\omega_{ij}=\eta_i\psi_j$.

Without loss of generality we may assume that the point P (and hence
also W) is at finite distance with respect to the nonhomogeneous coordinates
$\xi_i=\eta_i/\eta_0$, $i=1, 2, \dots, n$. Let $\Gamma'$ denote an irreducible component of T[P].
For some value of h, $h=1, 2, \dots, r-s$, it will be true that $\Gamma'$ is at finite
distance with respect to the nonhomogeneous coordinates

$\xi^{(h)}_{ij}=\omega_{ij}/\omega_{0h}$, $i=0, 1, \dots, n$; $j=1, 2, \dots, r-s$,


with coefficients in $\mathfrak{J}$. If the coefficients of this form are replaced by their residues mod $\mathfrak{m}$,
one obtains a form in $\tau_1, \tau_2, \dots, \tau_r$ with coefficients in the residue field $\mathfrak{J}/\mathfrak{m}$. The property
of uniformizing parameters stated above implies that this form is uniquely determined by the
element $\alpha$. This form is called by Krull the leading form of $\alpha$ [3, p. 207].

It is a straightforward matter to show that r elements $t_1, t_2, \dots, t_r$ are uniformizing
parameters of P, that is, $\mathfrak{m}=\mathfrak{J}\cdot(t_1, t_2, \dots, t_r)$, if and only if the leading forms of $t_1, t_2, \dots, t_r$
are linear and linearly independent.

Let $\mathfrak{p}$ denote the prime ideal of W in $\mathfrak{J}$ and let $\mathfrak{J}^*=\mathfrak{J}/\mathfrak{p}$, $\mathfrak{m}^*=\mathfrak{m}/\mathfrak{p}$. Then $\mathfrak{J}^*$ is the quotient
ring of the point P, regarded as a point of W, and $\mathfrak{m}^*$ is the ideal of non-units of $\mathfrak{J}^*$. Since, by
hypothesis, P is a simple point of W, there exist s elements in $\mathfrak{J}^*$, say $t^*_{r-s+1}, \dots, t^*_r$, such
that $\mathfrak{m}^*=\mathfrak{J}^*\cdot(t^*_{r-s+1}, \dots, t^*_r)$. Let $t_{r-s+1}, \dots, t_r$ be elements of $\mathfrak{J}$ whose $\mathfrak{p}$-residues
are $t^*_{r-s+1}, \dots, t^*_r$, respectively. We will have then: $\mathfrak{m}=(\mathfrak{p}, t_{r-s+1}, \dots, t_r)$.
From this relation we draw the following consequences. In the first place it follows that the
ideal $\mathfrak{p}$ must contain r−s elements whose leading forms are linear and linearly independent.
Let $t_1, t_2, \dots, t_{r-s}$ be such r−s elements of $\mathfrak{p}$. If $\mathfrak{J}'$ is the perfect closure of $\mathfrak{J}$ (see Krull [3,
p. 217]), then it is a straightforward matter to show that the ideal $\mathfrak{J}'\cdot(t_1, t_2, \dots, t_{r-s})$ is prime.
Therefore also the ideal $\mathfrak{J}\cdot(t_1, t_2, \dots, t_{r-s})$ is prime, since it is the contraction of the ideal
$\mathfrak{J}'\cdot(t_1, t_2, \dots, t_{r-s})$ (Krull [3, Theorem 15]). Since the leading ideal of $\mathfrak{J}\cdot(t_1, t_2, \dots, t_{r-s})$ is
of dimension s, it follows (Krull [3, Theorem 8]) that also $\mathfrak{J}\cdot(t_1, t_2, \dots, t_{r-s})$ is of dimension s.
Consequently this ideal coincides with $\mathfrak{p}$. We have therefore: $\mathfrak{p}=\mathfrak{J}\cdot(t_1, t_2, \dots, t_{r-s})$,
$\mathfrak{m}=(\mathfrak{p}, t_{r-s+1}, \dots, t_r)=\mathfrak{J}\cdot(t_1, t_2, \dots, t_r)$, q.e.d.

(35) Note that if two ideals coincide locally at some W they also coincide locally at any $W_1$
such that $W\subset W_1$.

(36) Since our new transformation behaves locally at P as the given monoidal
transformation, we could refer to our new transformation as being locally monoidal at P.




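of the general point of $V'$. Let $\mathfrak{o}_h'$ denote the ring of these nonhomogeneous
coordinates, and let $\mathfrak{o}$ denote, as usual, the ring $K[\xi_1, \xi_2, \dots, \xi_n]$. Since
$\xi^{(h)}_{ij}=\xi_it_j/t_h$ (with $\xi_0=1$), we find:

(14) $\mathfrak{o}_h'=\mathfrak{o}[t_1/t_h, t_2/t_h, \dots, t_{r-s}/t_h]$.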


Without loss of generality we may assume that $\Gamma'$ is at finite distance with
respect to the nonhomogeneous coordinates $\xi^{(1)}_{ij}$. For simplicity we shall drop
the subscript 1 in the symbol $\mathfrak{o}_1'$, that is, we shall use the symbol $\mathfrak{o}'$ to
denote the ring $\mathfrak{o}_1'$.

Let $\mathfrak{p}_0$ denote the prime zero-dimensional $\mathfrak{o}$-ideal of the point P. We shall
denote by $\mathfrak{p}$ the prime $\mathfrak{o}$-ideal of W. For clarity of exposition we divide our
proof into several steps.

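(1) We shall show first that the ideal $\mathfrak{o}'\cdot\mathfrak{p}_0$ is prime. Let $\mathfrak{J}=\mathfrak{o}_{\mathfrak{p}_0}$ denote
the quotient ring of P and let $\mathfrak{J}'=\mathfrak{J}\cdot\mathfrak{o}'$. The ring $\mathfrak{J}'$ is a quotient ring of $\mathfrak{o}'$,
namely $\mathfrak{J}'=\mathfrak{o}_S'$ where $S=\mathfrak{o}-\mathfrak{p}_0$. Therefore, in view of the relationship
between the ideals in a ring and the ideals in its quotient ring, the ideal $\mathfrak{o}'\cdot\mathfrak{p}_0$
is prime if and only if $\mathfrak{J}'\cdot\mathfrak{p}_0$ is prime(37). We prefer to deal with the ring $\mathfrak{J}'$
and to show that $\mathfrak{J}'\cdot\mathfrak{p}_0$ is prime.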


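Since $\mathfrak{J}\cdot\mathfrak{p}_0=\mathfrak{J}\cdot(t_1, t_2, \dots, t_r)$ and $t_i=t_1\cdot t_i/t_1$, $i=2, 3, \dots, r-s$, that is,
$t_2, t_3, \dots, t_{r-s}$ are multiples of $t_1$ in $\mathfrak{J}'$, it follows that

(15) $\mathfrak{J}'\cdot\mathfrak{p}_0=\mathfrak{J}'\cdot(t_1, t_{r-s+1}, \dots, t_r)$.

Any element $\alpha$ in $\mathfrak{J}'$ can be written in the form:

$\alpha=\phi_\rho(t_1, t_2, \dots, t_{r-s})/t_1^\rho$,

where $\phi_\rho$ is a form of degree $\rho$ in $t_1, t_2, \dots, t_{r-s}$, with coefficients in $\mathfrak{J}$. Let
$\beta$ be another element in $\mathfrak{J}'$,

$\beta=\psi_\sigma(t_1, t_2, \dots, t_{r-s})/t_1^\sigma$,

and let us assume that $\alpha\beta\equiv 0\ (\mathfrak{J}'\cdot\mathfrak{p}_0)$. We will have a relation of the form:

$t_1^\mu\,\phi_\rho\psi_\sigma=t_1^{\rho+\sigma}\,[\,t_1f_\mu+t_{r-s+1}f_\mu^{(1)}+\cdots+t_rf_\mu^{(s)}\,]$,

where $f_\mu, f_\mu^{(1)}, \dots, f_\mu^{(s)}$ are forms of degree $\mu$, with coefficients in $\mathfrak{J}$. The
right-hand side of this relation is a form of degree $\rho+\sigma+\mu+1$ in $t_1, t_2, \dots, t_r$, with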


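(37) We have a (1, 1) isomorphic correspondence between the ideals in $\mathfrak{J}'$ and those ideals
in $\mathfrak{o}'$ all prime ideals of which contract in $\mathfrak{o}$ to $\mathfrak{p}_0$ or to multiples of $\mathfrak{p}_0$. Now since $\mathfrak{p}_0$ is a maximal
ideal, every prime ideal of $\mathfrak{o}'\cdot\mathfrak{p}_0$ contracts to $\mathfrak{p}_0$. Hence $\mathfrak{o}'\cdot\mathfrak{p}_0$ and $\mathfrak{J}'\cdot\mathfrak{p}_0$ are corresponding ideals
in the above correspondence.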




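coefficients in $\mathfrak{J}$. Hence by a well known property of uniformizing
parameters(34) either all the coefficients of $\phi_\rho$ or all the coefficients of $\psi_\sigma$ must be
elements of $\mathfrak{J}\cdot\mathfrak{p}_0$. Suppose that all the coefficients of $\phi_\rho$ are in $\mathfrak{J}\cdot\mathfrak{p}_0$. Since
$\mathfrak{J}\cdot\mathfrak{p}_0=\mathfrak{J}\cdot(t_1, t_2, \dots, t_r)$, we will have for $\phi_\rho$ an expression of the form:

$\phi_\rho=t_1\phi_{\rho 1}'+t_2\phi_{\rho 2}'+\cdots+t_r\phi_{\rho r}'$,

where the $\phi_{\rho i}'$ are again forms of degree $\rho$, with coefficients in $\mathfrak{J}$. Since
$t_2, \dots, t_{r-s}$ are multiples of $t_1$ in $\mathfrak{J}'$, we conclude immediately that $\phi_\rho/t_1^\rho$
is contained in the ideal $\mathfrak{J}'\cdot(t_1, t_{r-s+1}, \dots, t_r)$, that is, in view of (15),
$\alpha\equiv 0\ (\mathfrak{J}'\cdot\mathfrak{p}_0)$. This shows that $\mathfrak{J}'\cdot\mathfrak{p}_0$ is a prime ideal, as was asserted(38).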

(2) Let $\mathfrak{o}'\cdot\mathfrak{p}_0=\mathfrak{p}'$. We assert that the $\mathfrak{p}'$-residues of $t_2/t_1, \dots, t_{r-s}/t_1$ are
algebraically independent (over K). For a relation of algebraic dependence
between these residues would imply a relation of the form:

$\phi_\rho(t_1, t_2, \dots, t_{r-s})/t_1^\rho=\sum_{i=1}^{r}t_i\,g_\sigma^{(i)}(t_1, t_2, \dots, t_{r-s})/t_1^\sigma$,

where the $g_\sigma^{(i)}$ and $\phi_\rho$ are forms in $t_1, t_2, \dots, t_{r-s}$, with coefficients in $\mathfrak{o}$, and
where the coefficients of $\phi_\rho$ are not all in $\mathfrak{p}_0$. Such a relation, cleared of the
denominators, is in contradiction with the property of uniformizing parameters
stated above(34).

From the fact that $\mathfrak{o}'\cdot\mathfrak{p}_0$ is prime, it follows that $\Gamma'$ is the only irreducible
component of T[P] which is at finite distance with respect to the coordinates $\xi^{(1)}_{ij}$.

The fact that the $\mathfrak{p}'$-residues of $t_2/t_1, \dots, t_{r-s}/t_1$ are algebraically
independent, in conjunction with the fact that $\mathfrak{p}'\cap\mathfrak{o}$ is the zero-dimensional ideal
$\mathfrak{p}_0$, implies that $\Gamma'$ is of dimension r−s−1. Moreover, the algebraic
independence of the quotients $t_2/t_1, t_3/t_1, \dots, t_{r-s}/t_1$ implies in particular that they
do not belong to $\mathfrak{p}'$. Hence these quotients are units in the quotient ring
$Q(\Gamma')$. But then also $t_i/t_h\in Q(\Gamma')$, for $i, h=1, 2, \dots, r-s$, whence the rings
$\mathfrak{o}_h'$ (see (14)) belong to $Q(\Gamma')$. This shows that $\Gamma'$ is at finite distance also with
respect to the nonhomogeneous coordinates $\xi^{(h)}_{ij}$, for $h=1, 2, \dots, r-s$.
Consequently, $\Gamma'$ is the only irreducible component of T[P], that is, T[P] is
irreducible: $T[P]=\Gamma'$.
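For orientation, the simplest instance of steps (1) and (2) can be written out explicitly. The sketch below is an added illustration under assumptions of our own choosing: V is affine 3-space with coordinates $t_1, t_2, t_3$, W is the line $t_1=t_2=0$, and P is the origin, so that r = 3 and s = 1. The symbol $z_2$ stands for the quotient $t_2/t_1$.

```latex
% Assumed data: o = K[t_1,t_2,t_3], p = (t_1,t_2) (the line W),
% p_0 = (t_1,t_2,t_3) (the origin P); chart h = 1.
\[
\mathfrak{o}'=\mathfrak{o}[t_2/t_1]=K[t_1,z_2,t_3],\qquad t_2=t_1z_2 .
\]
% Step (1): the extended ideal of p_0 is prime and has s+1 = 2 generators, as in (15).
\[
\mathfrak{o}'\cdot\mathfrak{p}_0=(t_1,\ t_1z_2,\ t_3)=(t_1,\ t_3),\qquad
\mathfrak{o}'/(t_1,t_3)\cong K[z_2].
\]
% Step (2): the residue of t_2/t_1 = z_2 is transcendental over K,
% so \Gamma' = T[P] has dimension r-s-1 = 1.
% Transform of W: the extended ideal of p is principal, giving a variety
% of dimension r-1 = 2 that contains \Gamma'.
\[
\mathfrak{o}'\cdot\mathfrak{p}=(t_1,\ t_1z_2)=(t_1)\subset(t_1,t_3).
\]
```

In this toy case every conclusion of steps (1) and (2) can be read off from the polynomial ring $K[t_1, z_2, t_3]$; the general argument replaces this direct inspection by the leading-form property of uniformizing parameters.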

(3) Let $C'$ denote the irreducible (r−1)-dimensional variety T[W]. We
are interested in the quotient rings of $\Gamma'$ and of $C'$. On the basis of the
preceding considerations we find immediately that every element of $Q(\Gamma')$ is of the
form: $f_\rho(t_1, t_2, \dots, t_{r-s})/g_\rho(t_1, t_2, \dots, t_{r-s})$, where $f_\rho$ and $g_\rho$ are forms of like
degree $\rho$, with coefficients in $\mathfrak{o}$, and where the coefficients of $g_\rho$ are not all in $\mathfrak{p}_0$.
Similarly, making use of the remark in footnote 38, or also directly from the
properties of what we have called “p-adic divisor” in [11], we conclude that
the elements of $Q(C')$ are all of the form $f_\rho(t_1, t_2, \dots, t_{r-s})/g_\rho(t_1, t_2, \dots, t_{r-s})$,
(38) Exactly the same proof could be applied toward proving the following: if $\mathfrak{p}$ is the prime
ideal of W in $\mathfrak{o}$ and if $\mathfrak{J}=Q(W)$, $\mathfrak{J}'=\mathfrak{J}\cdot\mathfrak{o}'$, then the ideal $\mathfrak{J}'\cdot\mathfrak{p}\ (=\mathfrak{J}'\cdot t_1)$ is prime. From this we
could conclude that T[W] is irreducible (as asserted in Theorem 18) in exactly the same fashion
as we concluded in the text that T[P] is irreducible.




with the same conditions on $f_\rho$ and $g_\rho$ as above, except that now the
coefficients of $g_\rho$ must not all be in $\mathfrak{p}$, where $\mathfrak{p}$ is the prime ideal of W. Since $P\subset W$,
it follows that $Q(\Gamma')\subset Q(C')$, that is, T[P] lies on T[W] (that is, $\Gamma'$ lies
on $C'$).

Moreover, from the preceding considerations (see relation (15)), it follows
that the prime ideal of non-units in $Q(\Gamma')$ has the basis $t_1, t_{r-s+1}, \dots, t_r$,
consisting of s+1 elements. Since $\Gamma'$ is of dimension r−s−1, it follows that $\Gamma'$
is a simple subvariety of $V'$, and that $t_1, t_{r-s+1}, \dots, t_r$ are uniformizing
parameters of $\Gamma'$. In a similar fashion we find that $t_1$ is a uniformizing parameter
of the (r−1)-dimensional variety $C'$, and since $t_1$ is among the uniformizing
parameters of $\Gamma'$, we conclude that $\Gamma'$ is a simple subvariety of $C'$.

(4) To complete the proof of our theorem we have only to show that every
point $P'$ of $\Gamma'$ is simple for $V'$, $C'$ and $\Gamma'$. To show that $P'$ is simple for $V'$
we have to exhibit r uniformizing parameters at $P'$. Let $\Delta$ be the residue class
field of the point P, that is, let $\Delta=\mathfrak{o}/\mathfrak{p}_0$. Similarly let $\Delta'$ be the residue class
field of $P'$. Here $\Delta$ and $\Delta'$ are finite algebraic extensions of K, and $\Delta'\supset\Delta$
since $Q(P')\supset Q(P)$. Without loss of generality we may assume that $P'$ is at
finite distance with respect to the ring $\mathfrak{o}'$ of nonhomogeneous coordinates
($\mathfrak{o}'=\mathfrak{o}_1'$, see (14)), and is therefore given in $\mathfrak{o}'$ by a prime zero-dimensional
ideal $\mathfrak{p}_0'$.

Let $c_1, c_2, \dots, c_n$ be the P-residues of $\xi_1, \xi_2, \dots, \xi_n$ respectively ($c_i\in\Delta$)
and let $d_2, d_3, \dots, d_{r-s}$ be the $P'$-residues of $t_2/t_1, t_3/t_1, \dots, t_{r-s}/t_1$ ($d_i\in\Delta'$).
The element $d_i$ will be the root of an irreducible polynomial $f_i(z)$ with
coefficients in $\Delta$. We replace the coefficients of $f_i(z)$ by arbitrary but fixed elements
of $\mathfrak{o}$ of which they are residues. Let $F_i(z)$ be the polynomial with coefficients
in $\mathfrak{o}$ thus obtained, $i=2, 3, \dots, r-s$. If we assume that the polynomials $f_i(z)$
are all separable then we can conclude as in [11, p. 590] that the r elements

(16) $t_1, F_2(t_2/t_1), \dots, F_{r-s}(t_{r-s}/t_1), t_{r-s+1}, \dots, t_r$

are uniformizing parameters at $P'$. Since these elements include the
uniformizing parameters $t_1, t_{r-s+1}, \dots, t_r$ of $\Gamma'$ and the uniformizing parameter $t_1$
of $C'$, the proof is complete.

However, if some or all of the polynomials $f_i(z)$ are inseparable, then the
elements (16) are no longer uniformizing parameters at $P'$ (compare with
footnote 41). We shall therefore give here a new proof which applies both
to the separable and non-separable case. We consider the residue class ring
$\mathfrak{o}^*=\mathfrak{o}'/\mathfrak{p}'$, where $\mathfrak{p}'$ is the prime $\mathfrak{o}'$-ideal of $\Gamma'$. Let $z_2, z_3, \dots, z_{r-s}$ be the
$\mathfrak{p}'$-residues of $t_2/t_1, t_3/t_1, \dots, t_{r-s}/t_1$. Since $\mathfrak{p}'\cap\mathfrak{o}=\mathfrak{p}_0$ and since we have shown
earlier in this section that $z_2, z_3, \dots, z_{r-s}$ are algebraically independent over
K, it follows from (14), for h=1, that $\mathfrak{o}^*$ is a polynomial ring(39) over $\Delta$:


(39) This shows incidentally that the field of rational functions on $\Gamma'$ is a pure
transcendental extension of $\Delta$, whence $\Gamma'$ is a rational variety over $\Delta$.




$\mathfrak{o}^*=\Delta[z_2, z_3, \dots, z_{r-s}]$.


The rest of the proof will be based on the following lemma: 


LEMMA 9. In a polynomial ring $P_n=\Delta[x_1, x_2, \dots, x_n]$ over an arbitrary
field $\Delta$ every prime zero-dimensional ideal possesses a base consisting of n
elements(40).


Proof of the lemma. Since the lemma is trivially true for n=1, we proceed
by induction with respect to n. Let $\mathfrak{p}$ be a prime zero-dimensional ideal in $P_n$
and let $f(z)$ be the irreducible polynomial in $\Delta[z]$ such that $f(x_n)\equiv 0\ (\mathfrak{p})$.
The residue class ring $P_{n-1}^*=P_n/f(x_n)$ is obviously a polynomial ring
$\Delta^*[x_1, x_2, \dots, x_{n-1}]$ over the field $\Delta^*=\Delta(\alpha)$, where $\alpha$ is a root of $f(z)$. The
ideal $\mathfrak{p}^*=\mathfrak{p}/f(x_n)$ is prime and zero-dimensional in $P_{n-1}^*$. By our
induction, there exist n−1 elements $w_1^*, w_2^*, \dots, w_{n-1}^*$ in $P_{n-1}^*$ such that
$\mathfrak{p}^*=(w_1^*, w_2^*, \dots, w_{n-1}^*)$. Let $w_1, w_2, \dots, w_{n-1}$ be elements of $P_n$ whose
residues modulo $f(x_n)$ are, respectively, $w_1^*, w_2^*, \dots, w_{n-1}^*$. Then it is clear
that $\mathfrak{p}=(w_1, w_2, \dots, w_{n-1}, f(x_n))$, q.e.d.(41).
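A small illustration of the induction may help; the data are chosen by us and are not taken from the text. We assume the ground field is the rational field $\mathbf{Q}$, n = 2, and $\mathfrak{p}$ is the prime ideal of the (conjugate pair of) points with both coordinates equal to $\sqrt{2}$.

```latex
% Assumed data: ground field Q, n = 2, point with coordinates (\sqrt2, \sqrt2).
\[
f(z)=z^2-2,\qquad f(x_2)\equiv 0\ (\mathfrak{p}),\qquad
\mathbf{Q}[x_1,x_2]/(f(x_2))\cong \mathbf{Q}(\alpha)[x_1],\quad \alpha^2=2 .
\]
% In the residue ring the ideal p^* has a single generator, which we lift:
\[
\mathfrak{p}^*=(x_1-\alpha),\qquad w_1=x_1-x_2,\qquad
\mathfrak{p}=(x_1-x_2,\ x_2^2-2).
\]
% Check: the quotient is a field, so these two elements generate a maximal ideal.
\[
\mathbf{Q}[x_1,x_2]/(x_1-x_2,\ x_2^2-2)\cong
\mathbf{Q}[x_2]/(x_2^2-2)\cong \mathbf{Q}(\sqrt2).
\]
```

The same recipe shows why the lemma is needed in the inseparable case. Over a field $k(t)$ of characteristic p, again as an assumed example, the point with both coordinates equal to $t^{1/p}$ has the ideal $(x_1-x_2,\ x_2^p-t)$, whereas $x_1^p-t=(x_1-x_2)^p+(x_2^p-t)$, so the two polynomials $x_1^p-t$ and $x_2^p-t$ generate only $((x_1-x_2)^p,\ x_2^p-t)$, which is not the prime ideal of the point.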

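We now apply our lemma. In the homomorphism $\mathfrak{o}'\sim\mathfrak{o}^*$ the prime $\mathfrak{o}'$-ideal
$\mathfrak{p}_0'$ of the point $P'$ is mapped upon a prime zero-dimensional $\mathfrak{o}^*$-ideal $\mathfrak{p}_0'^*$. By
the lemma we have

$\mathfrak{p}_0'^*=\mathfrak{o}^*\cdot(\zeta_2^*, \zeta_3^*, \dots, \zeta_{r-s}^*)$.

Let $\zeta_2, \zeta_3, \dots, \zeta_{r-s}$ be elements of $\mathfrak{o}'$ whose $\mathfrak{p}'$-residues are respectively
$\zeta_2^*, \zeta_3^*, \dots, \zeta_{r-s}^*$. Then we have

(17) $\mathfrak{p}_0'=(\mathfrak{p}', \zeta_2, \zeta_3, \dots, \zeta_{r-s})$.

Let $\mathfrak{J}_1'$ be the quotient ring of $P'$. Since $Q(P')\supset Q(P)=\mathfrak{J}$ and since $Q(P')\supset\mathfrak{o}'$,
the ring $\mathfrak{J}_1'$ contains all the rings previously considered, that is, the rings
$\mathfrak{o}$, $\mathfrak{o}'$, $\mathfrak{J}$ and $\mathfrak{J}'$ ($\mathfrak{J}'=\mathfrak{J}\cdot\mathfrak{o}'$). We have $\mathfrak{p}'=\mathfrak{o}'\cdot\mathfrak{p}_0$, whence
$\mathfrak{J}_1'\cdot\mathfrak{p}'=\mathfrak{J}_1'\cdot\mathfrak{p}_0=\mathfrak{J}_1'\cdot(t_1, t_2, \dots, t_r)=\mathfrak{J}_1'\cdot(t_1, t_{r-s+1}, \dots, t_r)$
(since $t_i/t_1\in\mathfrak{J}_1'$, $i=2, 3, \dots, r-s$). Substituting into (17) we find:

$\mathfrak{J}_1'\cdot\mathfrak{p}_0'=\mathfrak{J}_1'\cdot(t_1, \zeta_2, \zeta_3, \dots, \zeta_{r-s}, t_{r-s+1}, \dots, t_r)$.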


(40) This lemma implies that every point P of an affine (or of a projective) space over $\Delta$
is simple. The lemma gives, however, a stronger result, since it shows that uniformizing
parameters at P can be so selected that they generate the prime ideal of P not only in Q(P) but also
in the polynomial ring. In other words: every point P of an affine n-space is complete intersection
of n hypersurfaces. This result can be extended without any difficulties to projective spaces by
a similar inductive argument.

(41) In the case of ground fields of characteristic zero we have used instead of the above
lemma the following property of the polynomial ring $P_n$: if $f_i(z)$ is the irreducible polynomial in
$\Delta[z]$ such that $f_i(x_i)\equiv 0\ (\mathfrak{p})$, then the ideal $(f_1(x_1), f_2(x_2), \dots, f_n(x_n))$ is the intersection of prime
zero-dimensional ideals, one of which is of course the ideal $\mathfrak{p}$ itself. In this case the n polynomials
$f_i(x_i)$ are uniformizing parameters for the point defined by $\mathfrak{p}$. This reasoning applies also in the
case in which the polynomials $f_i(x_i)$ are separable.




This exhibits r uniformizing parameters at $P'$, and since these parameters
include the uniformizing parameters of $\Gamma'$ and that of $C'$, the proof is now
complete.


COROLLARY. If all points of W are simple both for W and for V, then
$T[W]=T\{W\}$ (T being a monoidal transformation of center W) and T[W] is
irreducible, (r−1)-dimensional and all its points are simple both for T[W] and $V'$.
Moreover, if W is of dimension s, then T[W] is covered by an s-dimensional
algebraic system $\{\Gamma'\}$ of (r−s−1)-dimensional varieties $\Gamma'$ in (1, 1)
correspondence with the points of W. Each $\Gamma'$ is irreducible, rational and free from
singularities, and through each point of T[W] there passes a unique $\Gamma'$.


REFERENCES 


1. H. Grell, Beziehungen zwischen den Idealen verschiedener Ringe, Math. Ann. vol. 97
(1927).

2. W. Krull, Idealtheorie, Ergebnisse der Mathematik und ihrer Grenzgebiete, IV 3.

3. W. Krull, Dimensionstheorie in Stellenringen, J. Reine Angew. Math. vol. 179 (1938).

4. S. MacLane and O. F. G. Schilling, Zero-dimensional branches of rank one on algebraic
varieties, Ann. of Math. (2) vol. 40 (1939).

5. F. K. Schmidt, Über die Erhaltung der Kettensätze der Idealtheorie bei beliebigen endlichen
Körpererweiterungen, Math. Zeit. vol. 41 (1936).

6. B. L. van der Waerden, Algebraische Korrespondenzen und rationale Abbildungen, Math.
Ann. vol. 110 (1934).

7. O. Zariski, Some results in the arithmetic theory of algebraic varieties, Amer. J. Math.
vol. 61 (1939).

8. O. Zariski, The reduction of the singularities of an algebraic surface, Ann. of Math. vol. 40
(1939).

9. O. Zariski, Algebraic varieties over ground fields of characteristic zero, Amer. J. Math. vol. 62
(1940).

10. O. Zariski, Pencils on an algebraic variety and a new proof of a theorem of Bertini, Trans.
Amer. Math. Soc. vol. 50 (1941).

11. O. Zariski, A simplified proof for the resolution of singularities of an algebraic surface,
Ann. of Math. (2) vol. 43 (1942).

12. O. Zariski, Normal varieties and birational correspondences, Bull. Amer. Math. Soc. vol. 48
(1942).


THE JOHNS HOPKINS UNIVERSITY,
BALTIMORE, MD.




